5' NUCLEASES DERIVED FROM THERMOSTABLE DNA POLYMERASE



This is a Continuation-In-Part Application of Application Serial 
No. 08/073,384, filed June 4, 1993, which is a Continuation-In-Part Application of 
Application Serial No. 07/986,330, filed December 12, 1992. 



FIELD OF THE INVENTION 

The present invention relates to means for cleaving a nucleic acid cleavage 
structure in a site-specific manner. In particular, the present invention relates to a 
cleaving enzyme having 5' nuclease activity without interfering nucleic acid 
synthetic ability. 



BACKGROUND OF THE INVENTION

The detection of specific nucleic acid sequences has been utilized to
diagnose the presence of viral or bacterial nucleic acid sequences indicative of an
infection, to detect variants or alleles of mammalian genes associated with
disease, and to identify the source of nucleic acids found in forensic samples and
in paternity determinations.

The detection of specific nucleic acid sequences has been achieved typically 
by hybridization. Hybridization methods involve the annealing of a complementary 
sequence to the target nucleic acid (the sequence to be detected). The ability of 
two polymers of nucleic acid containing complementary sequences to find each
other and anneal through base pairing interaction is a well-recognized phenomenon.
The initial observations of the "hybridization" process by Marmur and Lane, Proc.
Natl. Acad. Sci. USA 46:453 (1960) and Doty et al., Proc. Natl. Acad. Sci. USA
46:461 (1960) have been followed by the refinement of this process into an
essential tool of modern biology. 

Initial hybridization studies, such as those performed by Hayashi et al.,
Proc. Natl. Acad. Sci. USA 50:664 (1963), were performed in solution. Further
development led to the immobilization of the target DNA or RNA on solid
supports. With the discovery of specific restriction endonucleases by Smith and
Wilcox, J. Mol. Biol. 51:379 (1970), it became possible to isolate discrete
fragments of DNA. Utilization of immobilization techniques, such as those
described by Southern, J. Mol. Biol. 98:503 (1975), in combination with restriction
enzymes, has allowed for the identification by hybridization of single copy genes 
among a mass of fractionated, genomic DNA. 

In spite of the progress made in hybridization methodology, a number of 
problems have prevented the wide scale use of hybridization as a tool in human 
diagnostics. Among the more formidable problems are: 1) the inefficiency of 
hybridization; 2) the low concentration of specific target sequences in a mixture of 
genomic DNA; and 3) the hybridization of only partially complementary probes 
and targets. 

1. Inefficient Hybridization 

It is experimentally observed that only a fraction of the possible number of 
probe-target complexes are formed in a hybridization reaction. This is particularly 
true with short oligonucleotide probes (less than 100 bases in length). There are 
three fundamental causes: a) hybridization cannot occur because of secondary and 
tertiary structure interactions; b) strands of DNA containing the target sequence 
have rehybridized (reannealed) to their complementary strand; and c) some target 
molecules are prevented from hybridization when they are used in hybridization 
formats that immobilize the target nucleic acids to a solid surface. 

Even where the sequence of a probe is completely complementary to the 
sequence of the target, i.e., the target's primary structure, the target sequence must
be made accessible to the probe via rearrangements of higher-order structure. 
These higher-order structural rearrangements may concern either the secondary 
structure or tertiary structure of the molecule. Secondary structure is determined 
by intramolecular bonding. In the case of DNA or RNA targets this consists of 
hybridization within a single, continuous strand of bases (as opposed to 
hybridization between two different strands). Depending on the extent and position 
of intramolecular bonding, the probe can be displaced from the target sequence,
preventing hybridization.

Solution hybridization of oligonucleotide probes to denatured double-stranded
DNA is further complicated by the fact that the longer complementary
target strands can renature or reanneal. Again, hybridized probe is displaced by
this process. This results in a low yield of hybridization (low "coverage") relative 
to the starting concentrations of probe and target. 

The immobilization of target nucleic acids to solid surfaces such as nylon or
nitrocellulose is a common practice in molecular biology. Immobilization formats
eliminate the reassociation problem that can occur between complementary strands
of target molecules, but not the problems associated with secondary structure
effects. However, these mixed phase formats (i.e., Southern hybridization or dot
blot hybridization) require time consuming fixation procedures. The hybridization
reaction itself is kinetically much slower than a solution phase hybridization
reaction. Together, the fixation and hybridization procedures require a minimum of
several hours to several days to perform. Additionally, the standard immobilization
procedures are often inefficient and result in the attachment of many of the target
molecules to multiple portions on the solid surface, rendering them incapable of
subsequent hybridization to probe molecules. Overall, these combined effects result
in just a few percent of the initial target molecules being bound by probes in a
hybridization reaction.

2. Low Target Sequence Concentration 

In laboratory experiments, purified probes and targets are used. The
concentrations of these probes and targets, moreover, can be adjusted according to
the sensitivity required. By contrast, the goal in the application of hybridization to
medical diagnostics is the detection of a target sequence from a mixture of genomic
DNA. Usually the DNA fragment containing the target sequence is in relatively
low abundance in genomic DNA. This presents great technical difficulties; most
conventional methods that use oligonucleotide probes lack the sensitivity necessary
to detect hybridization at such low levels.

One attempt at a solution to the target sequence concentration problem is
the amplification of the detection signal. Most often this entails placing one or
more labels on an oligonucleotide probe. In the case of non-radioactive labels,
even the highest affinity reagents have been found to be unsuitable for the detection
of single copy genes in genomic DNA with oligonucleotide probes. See Wallace
et al., Biochimie 67:755 (1985). In the case of radioactive oligonucleotide probes,
only extremely high specific activities are found to show satisfactory results. See
Studencki and Wallace, DNA 3:1 (1984) and Studencki et al., Human Genetics
37:42 (1985).

Polymerase chain reaction (PCR) technology provides an alternate approach
to the problems of low target sequence concentration. PCR can be used to directly
increase the concentration of the target prior to hybridization. In U.S. Patents
Nos. 4,683,195 and 4,683,202, Mullis et al. describe a method for increasing the
concentration of a segment of target sequence in a mixture of genomic DNA
without cloning or purification.

This process for amplifying the target sequence consists of introducing a
molar excess of two oligonucleotide primers to the DNA mixture containing the
desired target sequence. The two primers are complementary to their respective
strands of the double-stranded sequence. The mixture is denatured and then
allowed to hybridize. Following hybridization, the primers are extended with
polymerase so as to form complementary strands. The steps of denaturation,
hybridization, and polymerase extension can be repeated as often as needed to
obtain a relatively high concentration of a segment of the desired target sequence.
The length of the segment of the desired target sequence is determined by the
relative positions of the primers with respect to each other, and, therefore, this
length is a controllable parameter. By virtue of the repeating aspect of the process,
the method is referred to by the inventors as the "Polymerase Chain Reaction" (or
PCR). Because the desired segments of the target sequence become the dominant
sequences (in terms of concentration) in the mixture, they are said to be
"PCR-amplified."
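The effect of repeating the denaturation, hybridization and extension steps can be illustrated with a minimal sketch. It assumes an idealized model, not stated in the text, in which each cycle multiplies the number of target segments by (1 + efficiency); the function name `pcr_copies` and the `efficiency` parameter are hypothetical and introduced only for illustration.

```python
def pcr_copies(initial_copies, cycles, efficiency=1.0):
    """Idealized number of target segments after a given number of cycles.

    Assumption (not from the text): every cycle multiplies the copy number
    by (1 + efficiency), where efficiency = 1.0 means perfect doubling.
    Real reactions plateau and also generate non-target fragments.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    return initial_copies * (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    for n in (10, 20, 30):
        print(n, "cycles:", pcr_copies(1, n), "copies from one starting copy")
```

Under this assumption the copy number grows geometrically with the number of cycles, which is why a segment that starts at low abundance can come to dominate the mixture.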

However, the PCR process is susceptible to the production of non-target
fragments during the amplification process. Spurious extension of primers at
partially complementary regions occurs during PCR reactions. Factors influencing
the specificity of the amplification process include: a) the concentration of the
target sequence in the DNA to be analyzed; b) the concentration of the Mg++,
polymerase enzyme and primers; c) the number of cycles of amplification
performed; and d) the temperatures and times used at the various steps in the
amplification process [PCR Technology - Principles and Applications for DNA
Amplification (H.A. Erlich, Ed.), Stockton Press, New York, pp. 7-16 (1989)].
When the specific target sequence is present in low concentration in the sample
DNA, more non-target fragments are produced. Low target concentration is often
the norm in clinical samples where the target may be present as a single copy in
the genome or where very little viral DNA is present, as in HIV infections.

Because amplification products are produced which do not represent the
specific target sequence to be detected, the products of a PCR reaction must be
analyzed using a probe specific for the target DNA. The detection of specific
amplification products has been accomplished by the hybridization of a probe
specific for the target sequence to the reaction products immobilized upon a solid
support. Such a detection method is cumbersome and is subject to the same
problems associated with the detection of any target molecule by hybridization as
discussed above.

A non-hybridization based detection assay for specific PCR products has
been described by Holland et al., Proc. Natl. Acad. Sci. USA 88:7276 (1991). In
this detection system, the 5' nuclease activity of wild type DNA polymerase from
Thermus aquaticus ("DNAPTaq") is used to generate a specific detectable product
concomitantly with amplification. An oligonucleotide probe specific for the target
DNA is labeled on the 5' end and added to the PCR reaction along with the
unlabelled primers used for extension of the target to be amplified. The 5'
nuclease activity of the DNAPTaq cleaves the labeled probe annealed to the target
DNA before the extension of the primer is complete, generating a smaller fragment
of the probe. This detection system requires that amplification be performed upon
the sample to produce the specific detection product. This is slow and requires
cumbersome equipment.

A minimum of 100 starting copies (i.e., copy number prior to amplification)
of target DNA were used in this detection system; it is not clear whether fewer
starting copies of target DNA will yield detectable results using this method. Very
low copy number may be a problem for some clinical samples where very little
DNA is obtained due to restrictions on sample size (blood from neonates or fetuses,
forensic samples, etc.).

While such an assay is an improvement over earlier hybridization detection
methods, it still requires that a PCR reaction be performed upon the sample and it
possesses certain inherent problems. One such problem is that this system requires
that the detection probe bind to the target DNA before primer extension
occurs. If extension occurs first, the probe binding site will be unavailable, no
digestion of the probe will occur and therefore no detectable signal will be
produced. To overcome this problem the user must vary the relative amounts of
primer and probe or manipulate the sequence and length of the probe. The need
for such optimization may prove too burdensome for clinical laboratories.



3. Partial Complementarity

Hybridization, regardless of the method used, requires some degree of
complementarity between the sequence being assayed (the target sequence) and the
fragment of DNA used to perform the test (the probe). (Of course, one can obtain
binding without any complementarity but this binding is nonspecific and to be
avoided.) For many diagnostic applications, it is not important to determine
whether the hybridization represents complete or partial complementarity. For
example, where it is desired to detect simply the presence or absence of pathogen
DNA (such as from a virus, bacterium, fungus, mycoplasma or protozoan) it is only
important that the hybridization method ensures hybridization when the relevant
sequence is present; conditions can be selected where both partially complementary
probes and completely complementary probes will hybridize. Other diagnostic
applications, however, may require that the method of hybridization distinguish
between variant target sequences. For example, it may be of interest to determine
whether a particular allelic variant of a pathogen is present. These normal and
variant sequences may differ in one or more bases.

There are other applications that may require that the hybridization method
distinguish between partial and complete complementarity. It may be of interest to
detect genetic polymorphisms. Human hemoglobin is composed, in part, of four
polypeptide chains. Two of these chains are identical chains of 141 amino acids
(alpha chains) and two of these chains are identical chains of 146 amino acids (beta
chains). The gene encoding the beta chain is known to exhibit polymorphism. The
normal allele encodes a beta chain having glutamic acid at the sixth position. The
mutant allele encodes a beta chain having valine at the sixth position. This
difference in amino acids has a profound (most profound when the individual is
homozygous for the mutant allele) physiological impact known clinically as sickle
cell anemia. It is well known that the genetic basis of the amino acid change
involves a single base difference between the normal allele DNA sequence and the
mutant allele DNA sequence.
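The single base difference described above can be made concrete with a short sketch. The codons used here (GAG for glutamic acid and GTG for valine) are the commonly cited sequences for the sixth position of the beta chain and are an assumption brought in from general knowledge; they are not given in this text. The helper name `single_base_differences` is hypothetical.

```python
# Only the two codons needed for this illustration (standard genetic code).
CODON_TABLE = {"GAG": "Glu", "GTG": "Val"}


def single_base_differences(normal, mutant):
    """Return (position, normal_base, mutant_base) for each mismatched base."""
    if len(normal) != len(mutant):
        raise ValueError("sequences must have the same length")
    return [(i, a, b) for i, (a, b) in enumerate(zip(normal, mutant)) if a != b]


normal_codon = "GAG"  # assumed normal allele codon (glutamic acid)
mutant_codon = "GTG"  # assumed mutant allele codon (valine)

differences = single_base_differences(normal_codon, mutant_codon)

if __name__ == "__main__":
    print(CODON_TABLE[normal_codon], "->", CODON_TABLE[mutant_codon])
    print("mismatched positions:", differences)
```

A probe complementary to one allele therefore forms a duplex with the other allele that is mismatched at exactly one position, which is the case a detection method must be able to discriminate.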

Unless combined with other techniques (such as restriction enzyme
analysis), hybridization methods that allow for the same level of hybridization in
the case of both partial and complete complementarity are unsuited for such
applications; the probe will hybridize to both the normal and variant target
sequence.

Methods have been devised to enable discrimination between partial and
complete complementarity. One approach is to take advantage of the temperature
requirements of the specific hybridization under study. In typical melting curve
experiments, such as those described by Wallace et al., Nucl. Acids Res. 6:3543
(1979) and Nucl. Acids Res. 9:879 (1981), an immobilized probe-target complex is
washed at increasing temperatures under non-equilibrium conditions. It is observed
that partially complementary probe-target complexes display a lower thermal
stability as compared to completely complementary probe-target complexes. This
difference can be used, therefore, to determine whether the probe has hybridized to
the partially complementary or the completely complementary target sequence.

Conventional methods that utilize the temperature-dependent nature of
hybridization are artful. The application of this method for the discrimination of
single base mutations in human genomic targets is limited to the use of short
oligonucleotide probes where the hybridization interaction with the target sequence
is in the size range of 17 bases to 25 bases in length. The lower length limit is
determined by the number of complements to the probe expected to occur at random
in the human genome, which is greater than 1 for a random 16 base pair interaction,
but less than 1 for interactions 17 bases or longer in length. The upper limit is one of
practicality. It is difficult to differentiate single base mismatches on the basis of
thermal stability for interactions longer than 25 bases in length. These
conventional methods are, unfortunately, also time consuming. Probe
concentrations in these experiments are approximately 1-5 x 10^-10 M. These
concentrations are empirically derived; they minimize the use of probe and
simultaneously provide sufficient discrimination to distinguish single copy genes
utilizing probes of approximately 20 nucleotides in length. Hybridization times are
two to ten hours at these concentrations. After hybridization, several washes of
varying stringency are employed to remove excess probe, non-specifically bound
probe, and probe bound to partially complementary sequences in the target genome.
Careful control of these wash steps is necessary, since the signal (specifically
bound probe) to noise (non-specifically bound probe) ratio of the experiment is
ultimately determined by the wash procedures.
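The lower length limit of 17 bases can be checked with a rough calculation. The sketch below rests on assumptions that are not stated in the text: a haploid human genome of about 3 x 10^9 base pairs, both strands available as potential complements, and a uniformly random base sequence. The function name `expected_random_matches` and its parameters are hypothetical.

```python
def expected_random_matches(probe_length, genome_bp=3.0e9, strands=2):
    """Expected number of exact random complements to a probe in a genome.

    Assumptions (not from the text): all four bases are equally likely and
    independent, the genome has `genome_bp` base pairs, and `strands`
    strands are searched. Edge effects at chromosome ends are ignored.
    """
    if probe_length < 1:
        raise ValueError("probe_length must be at least 1")
    return strands * genome_bp / 4 ** probe_length


if __name__ == "__main__":
    for length in (15, 16, 17, 20, 25):
        print(length, "bases:", expected_random_matches(length))
```

Under these assumptions the expected count falls from above 1 at 16 bases to below 1 at 17 bases, in line with the limit quoted above. A different genome size or a single-strand count would shift the crossover by about one base.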

No detection method heretofore described has solved all three of the
problems discussed above. The PCR process solves the problem of low target
concentration. However, the specific detection of PCR products by any
hybridization method is subject to the same problems associated with the detection
of any target molecules. The detection of single base differences between PCR
targets was initially accomplished through the use of a restriction enzyme analysis
of the hybridization complexes formed between oligonucleotide probes and PCR
targets. This technique is limited by the fact that restriction enzymes do not exist
for all sequences. More recent studies have achieved discrimination without
restriction enzymes; however, these studies have involved the inefficient
immobilization of target nucleic acids to solid surfaces [dot blot hybridization;
Saiki et al., Nature 324:163 (1986)].

Another method for the detection of allele-specific variants is disclosed by
Kwok et al., Nucl. Acids Res. 18:999 (1990). This method is based upon the fact
that it is difficult for a DNAP to synthesize a DNA strand when there is a
mismatch between the template strand and the primer. The mismatch acts to
prevent the extension thereby preventing the amplification of a target DNA that is
not perfectly complementary to the primer used in a PCR reaction. While an
allele-specific variant may be detected by the use of a primer that is perfectly
matched with only one of the possible alleles, this method of detection is artful and
has limitations. Particularly troublesome is the fact that the base composition of
the mismatch influences the ability to prevent extension across the mismatch.
Certain mismatches do not prevent extension or have only a minimal effect.

An ideal method of detecting specific target DNAs would allow detection
without the need to amplify the sample DNA first and would allow the detection of
target sequences which are present in low copy numbers in the DNA sample. This
ideal method would also allow the discrimination between variants of the target
sequence such that single base variations between alleles of mammalian genes can
be discerned.

One object of the present invention is to provide a method of detection of
specific nucleic acid sequences that solves the above-named problems.

SUMMARY OF THE INVENTION 

The present invention relates to means for cleaving a nucleic acid cleavage
structure in a site-specific manner. In one embodiment, the means for cleaving is a
cleaving enzyme comprising 5' nucleases derived from thermostable DNA
polymerases. These polymerases form the basis of a novel method of detection of
specific nucleic acid sequences. The present invention contemplates use of the
novel detection method for, among other uses, clinical diagnostic purposes.

In one embodiment, the present invention contemplates a DNA sequence
encoding a DNA polymerase altered in sequence (i.e., a "mutant" DNA
polymerase) relative to the native sequence such that it exhibits altered DNA
synthetic activity from that of the native (i.e., "wild type") DNA polymerase. It is
preferred that the encoded DNA polymerase is altered such that it exhibits reduced
synthetic activity from that of the native DNA polymerase. In this manner, the
enzymes of the invention are predominantly 5' nucleases and are capable of
cleaving nucleic acids in a structure-specific manner in the absence of interfering
synthetic activity.

Importantly, the 5' nucleases of the present invention are capable of
cleaving linear duplex structures to create single discrete cleavage products. These
linear structures are either 1) not cleaved by the wild type enzymes (to any
significant degree), or 2) are cleaved by the wild type enzymes so as to create
multiple products. This characteristic of the 5' nucleases has been found to be
a consistent property of enzymes derived in this manner from thermostable
polymerases across eubacterial thermophilic species.

It is not intended that the invention be limited by the nature of the alteration
necessary to render the polymerase synthesis-deficient, nor by the extent of the
deficiency. The present invention contemplates altered structure (primary,
secondary, etc.) as well as native structure inhibited by synthesis inhibitors.

Where the structure is altered, it is not intended that the invention be limited
by the means by which the structure of the polymerase is altered. In one
embodiment, the alteration of the native DNA sequence comprises a change in a
single nucleotide. In another embodiment, the alteration of the native DNA
sequence comprises a deletion of one or more nucleotides. In yet another
embodiment, the alteration of the native DNA sequence comprises an insertion of
one or more nucleotides. In any of these cases, the change in DNA sequence
may manifest itself in a change in amino acid sequence.

The present invention contemplates 5' nucleases from a variety of sources.
The preferred 5' nucleases are thermostable. Thermostable 5' nucleases are
contemplated as particularly useful in that they operate at temperatures where
nucleic acid hybridization is extremely specific, allowing for allele-specific
detection (including single-base mismatches). In one embodiment, the thermostable
5' nucleases are selected from the group consisting of altered polymerases derived
from the native polymerases of Thermus aquaticus, Thermus flavus and Thermus
thermophilus.

As noted above, the present invention contemplates the use of altered
polymerases in a detection method. In one embodiment, the present invention
contemplates a method of detecting the presence of a specific target nucleic acid
molecule comprising: a) providing: i) a cleavage means, ii) a target nucleic acid,
iii) a first oligonucleotide complementary to a first portion of said target nucleic
acid, iv) a first solid support having a second oligonucleotide, a region of which is
complementary to a second portion of said target nucleic acid, said
non-complementary region of said second oligonucleotide providing a single-stranded
arm at its 5' end, a portion of said 5' arm comprising a first signal oligonucleotide,
v) a plurality of "uncleaved" second solid supports each having a third
oligonucleotide, a region of which is complementary to said first signal
oligonucleotide, the non-complementary region of said third oligonucleotide
providing a single-stranded arm at its 5' end, a portion of said 5' arm comprising a
second signal oligonucleotide, and vi) a plurality of "uncleaved" third solid supports
each having a fourth oligonucleotide, a region of which is complementary to said
second signal oligonucleotide, the non-complementary region of said fourth
oligonucleotide providing a single-stranded arm at its 5' end, a portion of said 5'
arm comprising said first signal oligonucleotide; b) mixing said cleavage means,
said target nucleic acid, said first oligonucleotide and said second oligonucleotide
under conditions wherein said first oligonucleotide and the 3' end of said second
oligonucleotide are annealed to said target DNA sequence so as to create a first
cleavage structure and cleavage of said first cleavage structure results in the
liberating of said first signal oligonucleotide; d) reacting said liberated first signal
oligonucleotide with one of said plurality of second solid supports under conditions
such that said first signal oligonucleotide hybridizes to said complementary region
of said third oligonucleotide to create a second cleavage structure and cleavage of
said second cleavage structure results in the liberating of said second signal
oligonucleotide and a "cleaved" second solid support; e) reacting said liberated
second signal oligonucleotide with one of said plurality of third solid supports
under conditions such that said second signal oligonucleotide hybridizes to said
complementary region of said fourth oligonucleotide to create a third cleavage
structure and cleavage of said third cleavage structure results in the liberating of a
second molecule of said first signal oligonucleotide and a "cleaved" third solid
support; and h) detecting the presence of said first and second signal
oligonucleotides.

It is preferred that, after the hybridization of said first signal oligonucleotide
and liberation of said second signal oligonucleotide, said first signal oligonucleotide
is itself released from said "cleaved" second solid support and reacted with one of
said plurality of "uncleaved" second solid supports. Similarly, it is preferred that,
after the hybridization of said second signal oligonucleotide and liberation of said
second molecule of said first signal oligonucleotide, said second signal
oligonucleotide is itself released from said "cleaved" third solid support and reacted
with one of said plurality of "uncleaved" third solid supports. By the terms "cleaved"
and "uncleaved" it is not meant to indicate that the solid support (e.g., a bead) is
physically cleaved or uncleaved. Rather, it is meant to indicate the status of the
oligonucleotide attached to the solid support.

By reference to a "solid support" it is not intended that the invention be
limited to separate and discrete supports. For example, the invention contemplates
a design where the oligos are on the same solid support, albeit separate in different
regions. In one embodiment, the solid support is a microtiter well wherein the
oligos are attached (e.g., covalently) or coated (e.g., non-covalently) in different
regions of the well.
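The preferred recycling of the two signal oligonucleotides described above can be sketched as a simple counting model. The model is an idealization and not part of the disclosed method: it assumes synchronous rounds, an unlimited supply of "uncleaved" supports, and that every free signal oligonucleotide cleaves exactly one support per round and is then released for reuse. The function name `simulate_cascade` is hypothetical.

```python
def simulate_cascade(rounds, initial_first_signal=1):
    """Count free first and second signal oligonucleotides after each round.

    Idealized assumptions (not from the text): in every round each free
    first signal oligo liberates one second signal oligo, each free second
    signal oligo liberates one first signal oligo, and both are recycled.
    Returns a list of (first_signal, second_signal) tuples, one per round,
    starting with the state before any reaction.
    """
    if rounds < 0:
        raise ValueError("rounds must be non-negative")
    first, second = initial_first_signal, 0
    history = [(first, second)]
    for _ in range(rounds):
        new_second = first   # cleavage of second solid supports
        new_first = second   # cleavage of third solid supports
        first += new_first
        second += new_second
        history.append((first, second))
    return history


if __name__ == "__main__":
    for step, counts in enumerate(simulate_cascade(6)):
        print("round", step, "first/second signal oligos:", counts)
```

In this idealized model a single liberated first signal oligonucleotide yields a geometrically growing population of both signal oligonucleotides. Real reactions would be limited by kinetics and by depletion of the uncleaved supports.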

In a second embodiment, the present invention contemplates a method of
detecting the presence of a specific target nucleic acid molecule comprising:
a) providing: i) a target nucleic acid, ii) a first oligonucleotide complementary to a
first portion of said target nucleic acid, and iii) a second oligonucleotide, a region
of which is complementary to a second portion of said target nucleic acid, said
non-complementary region of said second oligonucleotide providing a
single-stranded arm at its 5' end; b) mixing said target nucleic acid, said first
oligonucleotide and said second oligonucleotide under conditions wherein said first
oligonucleotide and the 3' end of said second oligonucleotide are annealed to said
target DNA sequence so as to create a first cleavage structure; c) providing a
cleavage means under conditions such that cleavage of said first cleavage structure
occurs preferentially at a site located within said second oligonucleotide in a
manner dependent upon the annealing of said first and second oligonucleotides on
said target nucleic acid, thereby liberating the single-stranded arm of said second
oligonucleotide generating a third oligonucleotide; d) providing a first hairpin
structure having a single-stranded 3' arm and a single-stranded 5' arm under
conditions wherein said third oligonucleotide anneals to said single-stranded 3' arm
of said first hairpin thereby creating a second cleavage structure; e) providing
conditions under which cleavage of said second cleavage structure occurs by said
cleavage means liberating the single-stranded 5' arm of said second cleavage
structure so as to create reaction products comprising a fourth oligonucleotide and a
first cleaved hairpin detection molecule; f) providing a second hairpin structure
having a single-stranded 3' arm and a single-stranded 5' arm under conditions
wherein said fourth oligonucleotide anneals to the single-stranded 3' arm of said
second hairpin thereby creating a third cleavage structure; g) providing conditions
under which cleavage of said third cleavage structure occurs by said cleavage
means, liberating the single-stranded 5' arm of said third cleavage structure so as to
create reaction products comprising a fifth oligonucleotide identical in
sequence to said third oligonucleotide and a second cleaved hairpin detection
molecule; and h) detecting the presence of said first and second cleaved hairpin
detection molecules.

In one embodiment, the detection method of the present invention allows 
the detection of specific target nucleic acid sequences present in a sample without 
the need to amplify the number of target copies prior to detection. In this 
embodiment, steps d) through g) of the method are repeated at least once. 

In a preferred embodiment, the cleavage means comprises a cleavage 
enzyme comprising an altered thermostable DNA polymerase having reduced 
synthesis capability, i.e., a 5' nuclease derived from a thermostable DNA 
polymerase. While a complete absence of synthesis is not required, it is desired 
that cleavage reactions occur in the absence of polymerase activity at a level where 
it interferes with the discrimination needed for detection. 

While the cleavage of the second embodiment of the detection method of 
the present invention can be independent of the annealing of the oligonucleotides, it 
is preferred that the cleavage is primer-dependent. In other words, it is desired that 
the cleavage reactions of steps c), e) and g) will not occur absent the annealing of 
said first oligonucleotide, said third oligonucleotide and said fourth oligonucleotide, 
respectively. 

While cleavage is site-specific, the present invention allows for cleavage at 
a variety of sites. In one embodiment, the cleavage reaction of step c) occurs 
within the annealed portion of said second oligonucleotide. In another 
embodiment, the cleavage reaction of step c) occurs within the non-annealed 
portion of said second oligonucleotide.

DESCRIPTION OF THE DRAWINGS

Figure 1A provides a schematic of one embodiment of the detection method 
of the present invention. 

Figure 1B provides a schematic of a second embodiment of the detection
method of the present invention.

Figure 2 is a comparison of the nucleotide structure of the DNAP genes 
isolated from Thermus aquaticus, Thermus flavus and Thermus thermophilus; the
consensus sequence is shown at the top of each row. 

Figure 3 is a comparison of the amino acid sequence of the DNAP isolated 
from Thermus aquaticus, Thermus flavus, and Thermus thermophilus; the consensus
sequence is shown at the top of each row.

Figures 4A-G are a set of diagrams of wild-type and synthesis-deficient 
DNAPTaq genes. 

Figure 5A depicts the wild-type Thermus flavus polymerase gene.

Figure 5B depicts a synthesis-deficient Thermus flavus polymerase gene.

Figure 6 depicts a structure which cannot be amplified using DNAPTaq.

Figure 7 is an ethidium bromide-stained gel demonstrating attempts to
amplify a bifurcated duplex using either DNAPTaq or DNAPStf.

Figure 8 is an autoradiogram of a gel analyzing the cleavage of a bifurcated
duplex by DNAPTaq and lack of cleavage by DNAPStf.

Figures 9A-B are a set of autoradiograms of gels analyzing cleavage or lack 
of cleavage upon addition of different reaction components and change of 
incubation temperature during attempts to cleave a bifurcated duplex with 
DNAPTaq. 

Figures 10A-B are an autoradiogram displaying timed cleavage reactions,
with and without primer.

Figures 11A-B are a set of autoradiograms of gels demonstrating attempts to 
cleave a bifurcated duplex (with and without primer) with various DNAPs. 

Figure 12A shows the substrates and oligonucleotides used to test the
specific cleavage of substrate DNAs targeted by pilot oligonucleotides.

Figure 12B shows an autoradiogram of a gel showing the results of cleavage 
reactions using the substrates and oligonucleotides shown in Fig. 12A.

Figure 13A shows the substrate and oligonucleotide used to test the specific
cleavage of a substrate RNA targeted by a pilot oligonucleotide. 

Figure 13B shows an autoradiogram of a gel showing the results of a 
cleavage reaction using the substrate and oligonucleotide shown in Fig. 13A.

Figure 14 is a diagram of vector pTTQ18.

Figure 15 is a diagram of vector pET-3c. 

Figure 16A-E depicts a set of molecules which are suitable substrates for 
cleavage by the 5' nuclease activity of DNAPs. 

Figure 17 is an autoradiogram of a gel showing the results of a cleavage
reaction run with synthesis-deficient DNAPs.

Figure 18 is an autoradiogram of a PEI chromatogram resolving the
products of an assay for synthetic activity in synthesis-deficient DNAPTaq clones.

Figure 19A depicts the substrate molecule used to test the ability of
synthesis-deficient DNAPs to cleave short hairpin structures.

Figure 19B shows an autoradiogram of a gel resolving the products of a
cleavage reaction run using the substrate shown in Fig. 19A.

Figure 20A shows the A- and T-hairpin molecules used in the 
trigger/detection assay. 

Figure 20B shows the sequence of the alpha primer used in the
trigger/detection assay.

Figure 20C shows the structure of the cleaved A- and T-hairpin molecules.

Figure 20D depicts the complementarity between the A- and T-hairpin
molecules. 

Figure 21 provides the complete 206-mer duplex sequence employed as a
substrate for the 5' nucleases of the present invention.

Figures 22A and B show the cleavage of linear nucleic acid substrates 
(based on the 206-mer of Figure 21) by wild type DNAPs and 5' nucleases isolated 
from Thermus aquaticus and Thermus flavus. 

Figure 23 provides a detailed schematic of one
embodiment of the detection method of the present invention.

Figure 24 shows the propagation of cleavage of the linear duplex nucleic 
acid structures of Figure 23 by the 5' nucleases of the present invention.

Figure 25A shows the "nibbling" phenomenon detected with the DNAPs of
the present invention. 

Figure 25B shows that the "nibbling" of Figure 25A is 5' nucleolytic 
cleavage and not phosphatase cleavage.

Figure 26 demonstrates that the "nibbling" phenomenon is duplex
dependent.

Figure 27 is a schematic showing how "nibbling" can be employed in a 
detection assay. 

Figure 28 demonstrates that "nibbling" can be target directed. 

DESCRIPTION OF THE INVENTION

The present invention relates to means for cleaving a nucleic acid cleavage 
structure in a site-specific manner. In particular, the present invention relates to a 
cleaving enzyme having 5' nuclease activity without interfering nucleic acid 
synthetic ability. 

This invention provides 5' nucleases derived from thermostable DNA
polymerases which exhibit altered DNA synthetic activity from that of native
thermostable DNA polymerases. The 5' nuclease activity of the polymerase is
retained while the synthetic activity is reduced or absent. Such 5' nucleases are
capable of catalyzing the structure-specific cleavage of nucleic acids in the absence
of interfering synthetic activity. The lack of synthetic activity during a cleavage
reaction results in nucleic acid cleavage products of uniform size.

The novel properties of the polymerases of the invention form the basis of a 
method of detecting specific nucleic acid sequences. This method relies upon the 
amplification of the detection molecule rather than upon the amplification of the
target sequence itself as do existing methods of detecting specific target sequences.

DNA polymerases (DNAPs), such as those isolated from E. coli or from 
thermophilic bacteria of the genus Thermus, are enzymes that synthesize new DNA
strands. Several of the known DNAPs contain associated nuclease activities in 
addition to the synthetic activity of the enzyme. 

Some DNAPs are known to remove nucleotides from the 5' and 3' ends of
DNA chains [Kornberg, DNA Replication, W.H. Freeman and Co., San Francisco,
pp. 127-139 (1980)]. These nuclease activities are usually referred to as 5'
exonuclease and 3' exonuclease activities, respectively. For example, the 5'
exonuclease activity located in the N-terminal domain of several DNAPs
participates in the removal of RNA primers during lagging strand synthesis during
DNA replication and the removal of damaged nucleotides during repair. Some
DNAPs, such as the E. coli DNA polymerase (DNAPEcl), also have a 3'
exonuclease activity responsible for proof-reading during DNA synthesis
(Kornberg, supra).

A DNAP isolated from Thermus aquaticus, termed Taq DNA polymerase
(DNAPTaq), has a 5' exonuclease activity, but lacks a functional 3' exonucleolytic
domain [Tindall and Kunkel, Biochem. 27:6008 (1988)]. Derivatives of DNAPEcl
and DNAPTaq, respectively called the Klenow and Stoffel fragments, lack 5'
exonuclease domains as a result of enzymatic or genetic manipulations [Brutlag et
al., Biochem. Biophys. Res. Commun. 37:982 (1969); Erlich et al., Science
252:1643 (1991); Setlow and Kornberg, J. Biol. Chem. 247:232 (1972)].

The 5' exonuclease activity of DNAPTaq was reported to require concurrent
synthesis [Gelfand, PCR Technology - Principles and Applications for DNA
Amplification (H.A. Erlich, Ed.), Stockton Press, New York, p. 19 (1989)].
Although mononucleotides predominate among the digestion products of the 5'
exonucleases of DNAPTaq and DNAPEcl, short oligonucleotides (< 12
nucleotides) can also be observed, implying that these so-called 5' exonucleases can
function endonucleolytically [Setlow, supra; Holland et al., Proc. Natl. Acad. Sci.
USA 88:7276 (1991)].

In WO 92/06200, Gelfand et al. show that the preferred substrate of the 5'
exonuclease activity of the thermostable DNA polymerases is displaced
single-stranded DNA. Hydrolysis of the phosphodiester bond occurs between the
displaced single-stranded DNA and the double-helical DNA with the preferred
exonuclease cleavage site being a phosphodiester bond in the double helical region.
Thus, the 5' exonuclease activity usually associated with DNAPs is a
structure-dependent single-stranded endonuclease and is more properly referred to as a 5'
nuclease. Exonucleases are enzymes which cleave nucleotide molecules from the
ends of the nucleic acid molecule. Endonucleases, on the other hand, are enzymes
which cleave the nucleic acid molecule at internal rather than terminal sites. The
nuclease activity associated with some thermostable DNA polymerases cleaves
endonucleolytically but this cleavage requires contact with the 5' end of the
molecule being cleaved. Therefore, these nucleases are referred to as 5' nucleases.

When a 5' nuclease activity is associated with a eubacterial Type A DNA
polymerase, it is found in the one-third N-terminal region of the protein as an
independent functional domain. The C-terminal two-thirds of the molecule
constitute the polymerization domain which is responsible for the synthesis of
DNA. Some Type A DNA polymerases also have a 3' exonuclease activity
associated with the two-third C-terminal region of the molecule.

The 5' exonuclease activity and the polymerization activity of DNAPs have 
been separated by proteolytic cleavage or genetic manipulation of the polymerase 
molecule. To date thermostable DNAPs have been modified to remove or reduce 
the amount of 5' nuclease activity while leaving the polymerase activity intact. 

The Klenow or large proteolytic cleavage fragment of DNAPEcl contains
the polymerase and 3' exonuclease activity but lacks the 5' nuclease activity. The
Stoffel fragment of DNAPTaq lacks the 5' nuclease activity due to a genetic
manipulation which deleted the N-terminal 289 amino acids of the polymerase
molecule [Erlich et al., Science 252:1643 (1991)]. WO 92/06200 describes a
thermostable DNAP with an altered level of 5' to 3' exonuclease. U.S. Patent No.
5,108,892 describes a Thermus aquaticus DNAP without a 5' to 3' exonuclease.
However, the art of molecular biology lacks a thermostable DNA polymerase with
a lessened amount of synthetic activity.

The present invention provides 5' nucleases derived from thermostable Type
A DNA polymerases that retain 5' nuclease activity but have reduced or absent
synthetic activity. The ability to uncouple the synthetic activity of the enzyme
from the 5' nuclease activity proves that the 5' nuclease activity does not require
concurrent DNA synthesis as was previously reported (Gelfand, PCR Technology,
supra).

The description of the invention is divided into: I. Detection of Specific
Nucleic Acid Sequences Using 5' Nucleases; II. Generation of 5' Nucleases
Derived From Thermostable DNA Polymerases; III. Therapeutic Uses of 5'
Nucleases; and IV. Detection of Antigenic or Nucleic Acid Targets by a Dual
Capture Assay. To facilitate understanding of the invention, a number of terms are
defined below.

The term "gene" refers to a DNA sequence that comprises control and
coding sequences necessary for the production of a polypeptide or precursor. The
polypeptide can be encoded by a full length coding sequence or by any portion of
the coding sequence so long as the desired enzymatic activity is retained.

The term "wild-type" refers to a gene or gene product which has the
characteristics of that gene or gene product when isolated from a naturally
occurring source. In contrast, the term "modified" or "mutant" refers to a gene or
gene product which displays altered characteristics when compared to the wild-type
gene or gene product. It is noted that naturally-occurring mutants can be isolated;
these are identified by the fact that they have altered characteristics when compared
to the wild-type gene or gene product.

The term "recombinant DNA vector" as used herein refers to DNA
sequences containing a desired coding sequence and appropriate DNA sequences
necessary for the expression of the operably linked coding sequence in a particular
host organism. DNA sequences necessary for expression in procaryotes include a
promoter, optionally an operator sequence, a ribosome binding site and possibly
other sequences. Eucaryotic cells are known to utilize promoters, polyadenylation
signals and enhancers.

The term "oligonucleotide" as used herein is defined as a molecule
comprised of two or more deoxyribonucleotides or ribonucleotides, preferably more
than three, and usually more than ten. The exact size will depend on many factors,
which in turn depend on the ultimate function or use of the oligonucleotide. The
oligonucleotide may be generated in any manner, including chemical synthesis,
DNA replication, reverse transcription, or a combination thereof.

Because mononucleotides are reacted to make oligonucleotides in a manner
such that the 5' phosphate of one mononucleotide pentose ring is attached to the 3'
oxygen of its neighbor in one direction via a phosphodiester linkage, an end of an
oligonucleotide is referred to as the "5' end" if its 5' phosphate is not linked to the
3' oxygen of a mononucleotide pentose ring and as the "3' end" if its 3' oxygen is
not linked to a 5' phosphate of a subsequent mononucleotide pentose ring. As used
herein, a nucleic acid sequence, even if internal to a larger oligonucleotide, also
may be said to have 5' and 3' ends.

When two different, non-overlapping oligonucleotides anneal to different
regions of the same linear complementary nucleic acid sequence, and the 3' end of
one oligonucleotide points towards the 5' end of the other, the former may be
called the "upstream" oligonucleotide and the latter the "downstream"
oligonucleotide.

The term "primer" refers to an oligonucleotide which is capable of acting as
a point of initiation of synthesis when placed under conditions in which primer
extension is initiated. An oligonucleotide "primer" may occur naturally, as in a
purified restriction digest, or may be produced synthetically.

A primer is selected to be "substantially" complementary to a strand of
specific sequence of the template. A primer must be sufficiently complementary to
hybridize with a template strand for primer elongation to occur. A primer
sequence need not reflect the exact sequence of the template. For example, a
non-complementary nucleotide fragment may be attached to the 5' end of the primer,
with the remainder of the primer sequence being substantially complementary to the
strand. Non-complementary bases or longer sequences can be interspersed into the
primer, provided that the primer sequence has sufficient complementarity with the
sequence of the template to hybridize and thereby form a template-primer complex
for synthesis of the extension product of the primer.

The complement of a nucleic acid sequence as used herein refers to an
oligonucleotide which, when aligned with the nucleic acid sequence such that the 5'
end of one sequence is paired with the 3' end of the other, is in "antiparallel
association." Certain bases not commonly found in natural nucleic acids may be
included in the nucleic acids of the present invention and include, for example,
inosine and 7-deazaguanine. Complementarity need not be perfect; stable duplexes
may contain mismatched base pairs or unmatched bases. Those skilled in the art of
nucleic acid technology can determine duplex stability empirically considering a
number of variables including, for example, the length of the oligonucleotide, base
composition and sequence of the oligonucleotide, ionic strength and incidence of
mismatched base pairs.
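
By way of illustration only, the antiparallel pairing described above can be sketched in a few lines of Python. The function names are hypothetical and are not part of the invention; the sketch assumes the four standard DNA bases only and ignores modified bases such as inosine and 7-deazaguanine.

```python
# Illustrative sketch only; names are hypothetical and only A, C, G, T are handled.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the strand that pairs with `seq` in antiparallel association.

    Both the input and the result are written 5' to 3'.
    """
    return seq.upper().translate(_COMPLEMENT)[::-1]


def count_mismatches(strand_a: str, strand_b: str) -> int:
    """Count mismatched positions when two equal-length strands are
    aligned antiparallel (5' end of one against 3' end of the other)."""
    if len(strand_a) != len(strand_b):
        raise ValueError("strands must be the same length")
    perfect_partner = reverse_complement(strand_a)
    return sum(1 for x, y in zip(perfect_partner, strand_b.upper()) if x != y)
```

A count of zero corresponds to perfect complementarity; as noted above, a duplex with a small nonzero count may still be stable, which this sketch does not attempt to predict.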

Stability of a nucleic acid duplex is measured by the melting temperature, or
"Tm." The Tm of a particular nucleic acid duplex under specified conditions is the
temperature at which half of the base pairs have disassociated.

The term "probe" as used herein refers to a labeled oligonucleotide which 
forms a duplex structure with a sequence in another nucleic acid, due to 
complementarity of at least one sequence in the probe with a sequence in the other 
nucleic acid. 

The term "label" as used herein refers to any atom or molecule which can
be used to provide a detectable (preferably quantifiable) signal, and which can be
attached to a nucleic acid or protein. Labels may provide signals detectable by 
fluorescence, radioactivity, colorimetry, gravimetry, X-ray diffraction or absorption, 
magnetism, enzymatic activity, and the like. 

The term "cleavage structure" as used herein, refers to a nucleic acid
structure which is a substrate for cleavage by the 5' nuclease activity of a DNAP.

The term "cleavage means" as used herein refers to any means which is 
capable of cleaving a cleavage structure in a specific manner. The cleavage means 
may include native DNAPs having 5' nuclease activity, and, more specifically,
modified DNAPs having 5' nuclease but lacking synthetic activity.

The term "liberating" as used herein refers to the release of a nucleic acid 
fragment from a larger nucleic acid fragment, such as an oligonucleotide, by the 
action of a 5' nuclease such that the released fragment is no longer covalently 
attached to the remainder of the oligonucleotide. 

The term "substrate strand" as used herein, means that strand of nucleic acid
in a cleavage structure in which the cleavage mediated by the 5' nuclease activity
occurs.

The term "template strand" as used herein, means that strand of nucleic acid 
in a cleavage structure which is at least partially complementary to the substrate 
strand and which anneals to the substrate strand to form the cleavage structure.

The term "Km" as used herein refers to the Michaelis-Menten constant for an
enzyme and is defined as the concentration of the specific substrate at which a
given enzyme yields one-half its maximum velocity in an enzyme catalyzed
reaction.
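
As a hedged numerical illustration of this definition, the standard Michaelis-Menten rate law, v = Vmax[S] / (Km + [S]), can be written as a short Python function. The rate law itself is textbook enzymology rather than something set out in this description, and the function and parameter names below are hypothetical.

```python
def michaelis_menten_rate(substrate_conc: float, v_max: float, k_m: float) -> float:
    """Reaction velocity under the standard Michaelis-Menten rate law.

    All three arguments are assumed to be in mutually consistent units.
    """
    if substrate_conc < 0 or v_max < 0 or k_m <= 0:
        raise ValueError("concentrations and v_max must be non-negative, k_m positive")
    return v_max * substrate_conc / (k_m + substrate_conc)
```

When `substrate_conc` equals `k_m`, the function returns exactly half of `v_max`, which is the defining property stated above.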



I. Detection Of Specific Nucleic Acid Sequences Using 5' Nucleases 

The 5' nucleases of the invention form the basis of a novel detection assay
for the identification of specific nucleic acid sequences. This detection system
identifies the presence of specific nucleic acid sequences by requiring the annealing
of two oligonucleotide probes to two portions of the target sequence. As used
herein, the term "target sequence" or "target nucleic acid sequence" refers to a
specific nucleic acid sequence within a polynucleotide sequence, such as genomic
DNA or RNA, which is to be either detected or cleaved or both.

Figure 1A provides a schematic of one embodiment of the detection method
of the present invention. The target sequence is recognized by two distinct
oligonucleotides in the triggering or trigger reaction. It is preferred that one of
these oligonucleotides is provided on a solid support. The other can be provided
free. In Figure 1A the free oligo is indicated as a "primer" and the other oligo is
shown attached to a bead designated as type 1. The target nucleic acid aligns the
two oligonucleotides for specific cleavage of the 5' arm (of the oligo on bead 1) by
the DNAPs of the present invention (not shown in Figure 1A).

The site of cleavage (indicated by a large solid arrowhead) is controlled by
the distance between the 3' end of the "primer" and the downstream fork of the
oligo on bead 1. The latter is designed with an uncleavable region (indicated by
the striping). In this manner neither oligonucleotide is subject to cleavage when
misaligned or when unattached to target nucleic acid.

Successful cleavage releases a single copy of what is referred to as the alpha
signal oligo. This oligo may contain a detectable moiety (e.g., fluorescein). On
the other hand, it may be unlabelled.

In one embodiment of the detection method, two more oligonucleotides are
provided on solid supports. The oligonucleotide shown in Figure 1A on bead 2 has
a region that is complementary to the alpha signal oligo (indicated as alpha prime)
allowing for hybridization. This structure can be cleaved by the DNAPs of the
present invention to release the beta signal oligo. The beta signal oligo can then
hybridize to type 3 beads having an oligo with a complementary region (indicated
as beta prime). Again, this structure can be cleaved by the DNAPs of the present
invention to release a new alpha oligo.

At this point, the amplification has been linear. To increase the power of the
method, it is desired that the alpha signal oligo hybridized to bead type 2 be
liberated after release of the beta oligo so that it may go on to hybridize with other
oligos on type 2 beads. Similarly, after release of an alpha oligo from type 3
beads, it is desired that the beta oligo be liberated.

The liberation of "captured" signal oligos can be achieved in a number of
ways. First, it has been found that the DNAPs of the present invention have a true
5' exonuclease capable of "nibbling" the 5' end of the alpha (and beta) prime oligo
(discussed below in more detail). Thus, under appropriate conditions, the
hybridization is destabilized by nibbling of the DNAP. Second, the alpha - alpha
prime (as well as the beta - beta prime) complex can be destabilized by heat (e.g.,
thermal cycling).

With the liberation of signal oligos by such techniques, each cleavage 
results in a doubling of the number of signal oligos. In this manner, detectable 
signal can quickly be achieved. 
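
The difference between linear amplification and doubling can be illustrated with a deliberately idealized Python sketch. It assumes, purely for illustration, that bead-bound oligos are unlimited, that every free signal oligo directs exactly one cleavage per round, and that liberation (by nibbling or heat) is either complete or absent; the function name and its parameters are hypothetical.

```python
def simulate_cascade(rounds: int, initial: int = 1, liberate: bool = True):
    """Idealized count of free signal oligos and cumulative cleavages.

    Each round, every free signal oligo hybridizes to a bead-bound oligo and
    directs one cleavage, releasing one new signal oligo. If `liberate` is
    True the captured oligo is also freed again; otherwise it stays bound.
    Returns (free_signal_oligos, total_cleavages).
    """
    if rounds < 0 or initial < 0:
        raise ValueError("rounds and initial must be non-negative")
    free = initial
    total_cleavages = 0
    for _ in range(rounds):
        cleavages = free                  # one cleavage per free signal oligo
        total_cleavages += cleavages
        released = cleavages              # each cleavage releases one new oligo
        free = released + (cleavages if liberate else 0)
    return free, total_cleavages
```

Under these assumptions, three rounds starting from one oligo give 8 free oligos with liberation, but still only 1 free oligo (and 3 cleavages in total) without it.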

Figure 1B provides a schematic of a second embodiment of the detection
method of the present invention. Again, the target sequence is recognized by two
distinct oligonucleotides in the triggering or trigger reaction and the target nucleic
acid aligns the two oligonucleotides for specific cleavage of the 5' arm by the
DNAPs of the present invention (not shown in Figure 1B). The first oligo is
completely complementary to a portion of the target sequence. The second
oligonucleotide is partially complementary to the target sequence; the 3' end of the
second oligonucleotide is fully complementary to the target sequence while the 5'
end is non-complementary and forms a single-stranded arm. The
non-complementary end of the second oligonucleotide may be a generic sequence which
can be used with a set of standard hairpin structures (described below). The
detection of different target sequences would require unique portions of two
oligonucleotides: the entire first oligonucleotide and the 3' end of the second
oligonucleotide. The 5' arm of the second oligonucleotide can be invariant or
generic in sequence.
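
The division of the two oligonucleotides into target-specific and generic portions can be sketched as follows. This is an illustrative assumption-laden example, not a design procedure taken from this description: the function name, its parameters, and the choice to place the two annealing regions immediately adjacent on the target are all hypothetical, and only the standard bases A, C, G and T are handled.

```python
# Illustrative sketch only; names and the zero-gap layout are assumptions.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Antiparallel complement of `seq`, both written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def design_trigger_oligos(target: str, split: int, first_len: int,
                          second_len: int, generic_arm: str):
    """Return (first_oligo, second_oligo), each written 5' to 3'.

    `target` is written 5' to 3'. The first oligonucleotide anneals to the
    `first_len` bases starting at index `split`; the 3' portion of the second
    oligonucleotide anneals to the `second_len` bases ending at `split`.
    The non-complementary `generic_arm` forms the 5' arm of the second oligo.
    """
    if split - second_len < 0 or split + first_len > len(target):
        raise ValueError("annealing regions fall outside the target")
    first_region = target[split:split + first_len]
    second_region = target[split - second_len:split]
    first_oligo = reverse_complement(first_region)
    second_oligo = generic_arm.upper() + reverse_complement(second_region)
    return first_oligo, second_oligo
```

With this layout the 3' end of the first oligonucleotide pairs next to the base where the annealed region of the second oligonucleotide begins, so that changing the target changes only the first oligonucleotide and the 3' portion of the second, while `generic_arm` is reused unchanged.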

The annealing of the first and second oligonucleotides near one another
along the target sequence forms a forked cleavage structure which is a substrate for
the 5' nuclease of DNA polymerases. The approximate location of the cleavage
site is again indicated by the large solid arrowhead in Figure 1B.

The 5' nucleases of the invention are capable of cleaving this structure but
are not capable of polymerizing the extension of the 3' end of the first
oligonucleotide. The lack of polymerization activity is advantageous as extension
of the first oligonucleotide results in displacement of the annealed region of the
second oligonucleotide and results in moving the site of cleavage along the second
oligonucleotide. If polymerization is allowed to occur to any significant amount,
multiple lengths of cleavage product will be generated. A single cleavage product
of uniform length is desirable as this cleavage product initiates the detection
reaction.

The trigger reaction may be run under conditions that allow for 
thermocycling. Thermocycling of the reaction allows for a logarithmic increase in
the amount of the trigger oligonucleotide released in the reaction. 

The second part of the detection method allows the annealing of the
fragment of the second oligonucleotide liberated by the cleavage of the first
cleavage structure formed in the triggering reaction (called the third or trigger
oligonucleotide) to a first hairpin structure. This first hairpin structure has a
single-stranded 5' arm and a single-stranded 3' arm. The third oligonucleotide triggers
the cleavage of this first hairpin structure by annealing to the 3' arm of the hairpin
thereby forming a substrate for cleavage by the 5' nuclease of the present
invention. The cleavage of this first hairpin structure generates two reaction
products: 1) the cleaved 5' arm of the hairpin called the fourth oligonucleotide,
and 2) the cleaved hairpin structure which now lacks the 5' arm and is smaller in
size than the uncleaved hairpin. This cleaved first hairpin may be used as a
detection molecule to indicate that cleavage directed by the trigger or third
oligonucleotide occurred. Thus, this indicates that the first two oligonucleotides
found and annealed to the target sequence thereby indicating the presence of the
target sequence in the sample.

The detection products are amplified by having the fourth oligonucleotide
anneal to a second hairpin structure. This hairpin structure has a 5' single-stranded
arm and a 3' single-stranded arm. The fourth oligonucleotide generated by
cleavage of the first hairpin structure anneals to the 3' arm of the second hairpin
structure thereby creating a third cleavage structure recognized by the 5' nuclease.
The cleavage of this second hairpin structure also generates two reaction products:
1) the cleaved 5' arm of the hairpin called the fifth oligonucleotide which is similar
or identical in sequence to the third oligonucleotide, and 2) the cleaved second hairpin
structure which now lacks the 5' arm and is smaller in size than the uncleaved
hairpin. This cleaved second hairpin may be used as a detection molecule and amplifies
the signal generated by the cleavage of the first hairpin structure. Simultaneously
with the annealing of the fourth oligonucleotide, the third oligonucleotide is
dissociated from the cleaved first hairpin molecule so that it is free to anneal to a
new copy of the first hairpin structure. The disassociation of the oligonucleotides
from the hairpin structures may be accomplished by heating or other means suitable
to disrupt base-pairing interactions.

Further amplification of the detection signal is achieved by annealing the
fifth oligonucleotide (similar or identical in sequence to the third oligonucleotide)
to another molecule of the first hairpin structure. Cleavage is then performed and
the oligonucleotide that is liberated then is annealed to another molecule of the
second hairpin structure. Successive rounds of annealing and cleavage of the first
and second hairpin structures, provided in excess, are performed to generate a
sufficient amount of cleaved hairpin products to be detected. The temperature of
the detection reaction is cycled just below and just above the annealing temperature
for the oligonucleotides used to direct cleavage of the hairpin structures, generally
about 55°C to 70°C. The number of cleavages will double in each cycle until the
amount of hairpin structures remaining is below the Km for the hairpin structures.
This point is reached when the hairpin structures are substantially used up. When
the detection reaction is to be used in a quantitative manner, the cycling reactions
are stopped before the accumulation of the cleaved hairpin detection products reaches
a plateau.
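
The approach to a plateau can be illustrated with a simple counting model in Python. It is a sketch under stated assumptions, not a kinetic model: every free oligonucleotide is assumed to direct exactly one cleavage per temperature cycle while uncleaved hairpins remain, enzyme kinetics are ignored, and the function and parameter names are hypothetical.

```python
def simulate_hairpin_cycling(cycles: int, hairpin1: int, hairpin2: int,
                             initial_trigger: int = 1):
    """Cumulative number of cleaved hairpins after each cycle.

    Third/fifth oligonucleotides direct cleavage of the first hairpin and
    release fourth oligonucleotides; fourth oligonucleotides direct cleavage
    of the second hairpin and release fifth oligonucleotides. After each
    cycle all directing oligonucleotides are assumed to dissociate and be
    available again.
    """
    trigger_for_1 = initial_trigger   # third and fifth oligonucleotides
    trigger_for_2 = 0                 # fourth oligonucleotides
    cumulative = 0
    history = []
    for _ in range(cycles):
        cut1 = min(trigger_for_1, hairpin1)   # limited by remaining hairpins
        cut2 = min(trigger_for_2, hairpin2)
        hairpin1 -= cut1
        hairpin2 -= cut2
        trigger_for_2 += cut1                 # each cut of hairpin 1 frees a fourth oligo
        trigger_for_1 += cut2                 # each cut of hairpin 2 frees a fifth oligo
        cumulative += cut1 + cut2
        history.append(cumulative)
    return history
```

Starting from a single trigger oligonucleotide and ten copies of each hairpin, the cumulative counts grow roughly geometrically for the first cycles and then level off once the hairpins are used up, which is why a quantitative reading would be taken before that point.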

Detection of the cleaved hairpin structures may be achieved in several ways.
In one embodiment, detection is achieved by separation on agarose or
polyacrylamide gels followed by staining with ethidium bromide. In another
embodiment, detection is achieved by separation of the cleaved and uncleaved
hairpin structures on a gel followed by autoradiography, when the hairpin structures
are first labelled with a radioactive probe, or by separation on chromatography
columns using HPLC or FPLC followed by detection of the differently sized
fragments by absorption at OD260. Other means of detection include detection of
changes in fluorescence polarization when the single-stranded 5' arm is released by
cleavage, and the increase in fluorescence of an intercalating fluorescent indicator as the
amount of primers annealed to 3' arms of the hairpin structures increases. The
formation of increasing amounts of duplex DNA (between the primer and the 3'
arm of the hairpin) occurs if successive rounds of cleavage occur.

The hairpin structures may be attached to a solid support, such as an
agarose, styrene or magnetic bead, via the 3' end of the hairpin. A spacer
molecule may be placed between the 3' end of the hairpin and the bead, if so
desired. The advantage of attaching the hairpin structures to a solid support is that
this prevents the hybridization of the two hairpin structures to one another over
regions which are complementary. If the hairpin structures anneal to one another,
this would reduce the amount of hairpins available for hybridization to the primers
released during the cleavage reactions. If the hairpin structures are attached to a
solid support, then additional methods of detection of the products of the cleavage
reaction may be employed. These methods include, but are not limited to, the
measurement of the released single-stranded 5' arm when the 5' arm contains a
label at the 5' terminus. This label may be radioactive, fluorescent, biotinylated,
etc. If the hairpin structure is not cleaved, the 5' label will remain attached to the
solid support. If cleavage occurs, the 5' label will be released from the solid
support.

The 3' end of the hairpin molecule may be blocked through the use of
dideoxynucleotides. A 3' terminus containing a dideoxynucleotide is unavailable to
participate in reactions with certain DNA modifying enzymes, such as terminal
transferase. Cleavage of the hairpin having a 3' terminal dideoxynucleotide
generates a new, unblocked 3' terminus at the site of cleavage. This new 3' end
has a free hydroxyl group which can interact with terminal transferase thus
providing another means of detecting the cleavage products.

The hairpin structures are designed so that their self-complementary regions
are very short (generally in the range of 3-8 base pairs). Thus, the hairpin
structures are not stable at the high temperatures at which this reaction is performed
(generally in the range of 50-75°C) unless the hairpin is stabilized by the presence
of the annealed oligonucleotide on the 3' arm of the hairpin. This instability
prevents the polymerase from cleaving the hairpin structure in the absence of an
associated primer thereby preventing false positive results due to
non-oligonucleotide directed cleavage.

As discussed above, the use of the 5' nucleases of the invention which have
reduced polymerization activity is advantageous in this method of detecting specific
nucleic acid sequences. Significant amounts of polymerization during the cleavage
reaction would cause shifting of the site of cleavage in unpredictable ways, resulting
in the production of a series of cleaved hairpin structures of various sizes rather
than a single easily quantifiable product. Additionally, the primers used in one
round of cleavage could, if elongated, become unusable for the next cycle, by
either forming an incorrect structure or by being too long to melt off under
moderate temperature cycling conditions. In a pristine system (i.e., lacking the
presence of dNTPs), one could use the unmodified polymerase, but the presence of
nucleotides (dNTPs) can decrease the per cycle efficiency enough to give a false
negative result. When a crude extract (genomic DNA preparations, crude cell
lysates, etc.) is employed, or where a sample of DNA from a PCR reaction or any
other sample that might be contaminated with dNTPs is used, the 5' nucleases of the
present invention that were derived from thermostable polymerases are particularly
useful.

II. Generation Of 5' Nucleases From Thermostable DNA
Polymerases

The genes encoding Type A DNA polymerases share about 85% homology 
to each other on the DNA sequence level. Preferred examples of thermostable 
polymerases include those isolated from Thermus aquaticus, Thermus flavus, and
Thermus thermophilus. However, other thermostable Type A polymerases which
have 5' nuclease activity are also suitable. Figs. 2 and 3 compare the nucleotide
and amino acid sequences of the three above mentioned polymerases. SEQ ID
NOS:1-3 display the nucleotide sequences and SEQ ID NOS:4-6 display the amino
acid sequences of the three wild-type polymerases. SEQ ID NO:1 corresponds to
the nucleic acid sequence of the wild type Thermus aquaticus DNA polymerase
gene isolated from the YT-1 strain [Lawyer et al., J. Biol. Chem. 264:6427
(1989)]. SEQ ID NO:2 corresponds to the nucleic acid sequence of the wild type
Thermus flavus DNA polymerase gene [Akhmetzjanov and Vakhitov, Nucl. Acids
Res. 20:5839 (1992)]. SEQ ID NO:3 corresponds to the nucleic acid sequence of
the wild type Thermus thermophilus DNA polymerase gene [Gelfand et al.,
WO 91/09950 (1991)]. SEQ ID NOS:7-8 depict the consensus nucleotide and
amino acid sequences, respectively, for the above three DNAPs (also shown on the
top row in Figs. 2 and 3).
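
The kind of comparison summarized in Figs. 2 and 3 can be illustrated with a small sketch. The sequences below are short hypothetical placeholders, not SEQ ID NOS:1-3, and the majority-rule consensus is an assumed convention; the disclosure does not state how its consensus rows were derived.

```python
from collections import Counter


def percent_identity(a: str, b: str) -> float:
    """Percent of aligned positions at which two equal-length sequences match."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def majority_consensus(seqs: list[str]) -> str:
    """Most common residue per column; 'N' where no residue holds a majority."""
    out = []
    for column in zip(*seqs):
        residue, count = Counter(column).most_common(1)[0]
        out.append(residue if count > len(column) / 2 else "N")
    return "".join(out)


# Hypothetical pre-aligned fragments standing in for three polymerase genes.
seq_a = "ATGCATGCAT"
seq_b = "ATGCATGCAA"
seq_c = "ATGGATGCAT"

identity_ab = percent_identity(seq_a, seq_b)
consensus = majority_consensus([seq_a, seq_b, seq_c])
```

A real comparison of full-length genes would first require an alignment step that handles insertions and deletions, which this sketch deliberately leaves out.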

The 5' nucleases of the invention derived from thermostable polymerases
have reduced synthetic ability, but retain substantially the same 5' exonuclease
activity as the native DNA polymerase. The term "substantially the same 5'
nuclease activity" as used herein means that the 5' nuclease activity of the modified
enzyme retains the ability to function as a structure-dependent single-stranded
endonuclease but not necessarily at the same rate of cleavage as compared to the
unmodified enzyme. Type A DNA polymerases may also be modified so as to
produce an enzyme which has increased 5' nuclease activity while having a reduced
level of synthetic activity. Modified enzymes having reduced synthetic activity and
increased 5' nuclease activity are also envisioned by the present invention.

By the term "reduced synthetic activity" as used herein it is meant that the
modified enzyme has less than the level of synthetic activity found in the
unmodified or "native" enzyme. The modified enzyme may have no synthetic
activity remaining or may have that level of synthetic activity that will not interfere
with the use of the modified enzyme in the detection assay described below. The
5' nucleases of the present invention are advantageous in situations where the
cleavage activity of the polymerase is desired, but the synthetic ability is not (such
as in the detection assay of the invention).

As noted above, it is not intended that the invention be limited by the nature 
of the alteration necessary to render the polymerase synthesis deficient. The 
present invention contemplates a variety of methods, including but not limited to: 
1) proteolysis; 2) recombinant constructs (including mutants); and 3) physical 
and/or chemical modification and/or inhibition.

1. Proteolysis 

Thermostable DNA polymerases having a reduced level of synthetic activity 
are produced by physically cleaving the unmodified enzyme with proteolytic 
enzymes to produce fragments of the enzyme that are deficient in synthetic activity 
but retain 5' nuclease activity. Following proteolytic digestion, the resulting
fragments are separated by standard chromatographic techniques and assayed for
the ability to synthesize DNA and to act as a 5' nuclease. The assays to determine 
synthetic activity and 5' nuclease activity are described below. 

2. Recombinant Constructs 

The examples below describe a preferred method for creating a construct
encoding a 5' nuclease derived from a thermostable DNA polymerase. As the
Type A DNA polymerases are similar in DNA sequence, the cloning strategies
employed for the Thermus aquaticus and flavus polymerases are applicable to other
thermostable Type A polymerases. In general, a thermostable DNA polymerase is
cloned by isolating genomic DNA using molecular biological methods from a
bacterium containing a thermostable Type A DNA polymerase. This genomic DNA
is exposed to primers which are capable of amplifying the polymerase gene by
PCR.
This amplified polymerase sequence is then subjected to standard deletion
processes to delete the polymerase portion of the gene. Suitable deletion processes 
are described below in the examples. 

The example below discusses the strategy used to determine which portions 
of the DNAPTaq polymerase domain could be removed without eliminating the 5' 
nuclease activity. Deletion of amino acids from the protein can be done either by 
deletion of the encoding genetic material, or by introduction of a translational stop 
codon by mutation or frame shift. In addition, proteolytic treatment of the protein 
molecule can be performed to remove segments of the protein. 

In the examples below, specific alterations of the Taq gene were: a deletion 
between nucleotides 1601 and 2502 (the end of the coding region), a 4 nucleotide 
insertion at position 2043, and deletions between nucleotides 1614 and 1848 and 
between nucleotides 875 and 1778 (numbering is as in SEQ ID NO:l). These 
modified sequences are described below in the examples and at SEQ ID NOS:9-12. 
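
The coordinate-based alterations listed above can be sketched on a toy sequence. This is an illustration only: the text does not state whether "between nucleotides 1601 and 2502" includes the end points, so the sketch assumes 1-based inclusive coordinates, and the 30-nucleotide sequence and the `GATC` insert are hypothetical stand-ins for SEQ ID NO:1 and the actual 4 nucleotide insertion.

```python
def delete_between(seq: str, start: int, end: int) -> str:
    """Remove 1-based positions start..end inclusive (assumed convention)."""
    if not 1 <= start <= end <= len(seq):
        raise ValueError("coordinates out of range")
    return seq[: start - 1] + seq[end:]


def insert_after(seq: str, position: int, fragment: str) -> str:
    """Insert fragment immediately after 1-based position (assumed convention)."""
    if not 0 <= position <= len(seq):
        raise ValueError("position out of range")
    return seq[:position] + fragment + seq[position:]


def preserves_frame(length_change: int) -> bool:
    """True when an insertion or deletion keeps the downstream reading frame."""
    return length_change % 3 == 0


# Toy 30-nucleotide open reading frame (hypothetical).
toy = "ATGAAACCCGGGTTTAAACCCGGGTTTTAA"

deleted = delete_between(toy, 10, 15)    # removes 6 nt
inserted = insert_after(toy, 12, "GATC")  # adds 4 nt

deletion_in_frame = preserves_frame(len(deleted) - len(toy))
insertion_in_frame = preserves_frame(len(inserted) - len(toy))
```

The 4 nucleotide insertion changes the length by a value that is not a multiple of three, which is how a small insertion can shift the frame and bring a stop codon into register downstream.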

Those skilled in the art understand that single base pair changes can be 
innocuous in terms of enzyme structure and function. Similarly, small additions 
and deletions can be present without substantially changing the exonuclease or 
polymerase function of these enzymes. 

Other deletions are also suitable to create the 5' nucleases of the present
invention. It is preferable that the deletion decrease the polymerase activity of the
5' nucleases to a level at which synthetic activity will not interfere with the use of
the 5' nuclease in the detection assay of the invention. Most preferably, the
synthetic ability is absent. Modified polymerases are tested for the presence of
synthetic and 5' nuclease activity as in assays described below. Thoughtful
consideration of these assays allows for the screening of candidate enzymes whose
structure is as yet unknown. In other words, construct "X" can be
evaluated according to the protocol described below to determine whether it is a
member of the genus of 5' nucleases of the present invention as defined
functionally, rather than structurally.
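
The functional definition above lends itself to a simple screening rule. The following is a minimal sketch, assuming each candidate has been run through the synthesis and 5' nuclease assays described below. The `AssayResult` fields and the 5% cut-off for residual synthetic activity are hypothetical, since the disclosure sets no numerical threshold.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AssayResult:
    name: str
    synthetic_fraction: float  # synthesis signal relative to native enzyme (1.0 = native)
    cleaves_structure: bool    # structure-dependent cleavage seen in the 5' nuclease assay


def is_functional_5prime_nuclease(result: AssayResult,
                                  max_synthetic_fraction: float = 0.05) -> bool:
    """Hypothetical rule: cleavage retained and synthesis at or below an assumed cut-off."""
    return (result.cleaves_structure
            and result.synthetic_fraction <= max_synthetic_fraction)


# Made-up assay outcomes for illustration.
candidates = [
    AssayResult("native", 1.00, True),
    AssayResult("construct X", 0.01, True),
    AssayResult("construct Y", 0.00, False),
]
accepted = [c.name for c in candidates if is_functional_5prime_nuclease(c)]
```

The point of the sketch is that membership is decided from assay outcomes alone; nothing about the construct's sequence enters the test.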

In the example below, the PCR product of the amplified Thermus aquaticus 
genomic DNA did not have the identical nucleotide structure of the native genomic 
DNA and did not have the same synthetic ability of the original clone. Base pair 
changes which result due to the infidelity of DNAPTaq during PCR amplification
of a polymerase gene are also a method by which the synthetic ability of a
polymerase gene may be inactivated. The examples below and Figs. 4A and 5A
indicate regions in the native Thermus aquaticus and flavus DNA polymerases
likely to be important for synthetic ability. There are other base pair changes and
substitutions that will likely also inactivate the polymerase.

It is not necessary, however, that one start out the process of producing a 5' 
nuclease from a DNA polymerase with such a mutated amplified product. This is 
the method by which the examples below were performed to generate the
synthesis-deficient DNAPTaq mutants, but it is understood by those skilled in the art that a
wild-type DNA polymerase sequence may be used as the starting material for the
introduction of deletions, insertions and substitutions to produce a 5' nuclease. For
example, to generate the synthesis-deficient DNAPTfl mutant, the primers listed in
SEQ ID NOS:13-14 were used to amplify the wild type DNA polymerase gene
from Thermus flavus strain AT-62. The amplified polymerase gene was then
subjected to restriction enzyme digestion to delete a large portion of the domain
encoding the synthetic activity. 

The present invention contemplates that the nucleic acid construct of the 
present invention be capable of expression in a suitable host. Those in the art 
know methods for attaching various promoters and 3' sequences to a gene structure
to achieve efficient expression. The examples below disclose two suitable vectors 
and six suitable vector constructs. Of course, there are other promoter/vector 
combinations that would be suitable. It is not necessary that a host organism be 
used for the expression of the nucleic acid constructs of the invention. For 
example, expression of the protein encoded by a nucleic acid construct may be 
achieved through the use of a cell-free in vitro transcription/translation system. An 
example of such a cell-free system is the commercially available TnT™ Coupled 
Reticulocyte Lysate System (Promega Corporation, Madison, WI). 

Once a suitable nucleic acid construct has been made, the 5' nuclease may 
be produced from the construct. The examples below and standard molecular 
biological teachings enable one to manipulate the construct by different suitable 
methods. 




WO 94/29482 



PCT/US94/06253 



Once the 5' nuclease has been expressed, the polymerase is tested for both 
synthetic and nuclease activity as described below. 



3. Physical And/Or Chemical Modification And/Or 
Inhibition 

The synthetic activity of a thermostable DNA polymerase may be reduced
by chemical and/or physical means. In one embodiment, the cleavage reaction
catalyzed by the 5' nuclease activity of the polymerase is run under conditions
which preferentially inhibit the synthetic activity of the polymerase. The level of
synthetic activity need only be reduced to that level of activity which does not
interfere with cleavage reactions requiring no significant synthetic activity.

As shown in the examples below, concentrations of Mg++ greater than 5 mM
inhibit the polymerization activity of the native DNAPTaq. The ability of the 5'
nuclease to function under conditions where synthetic activity is inhibited is tested
by running the assays for synthetic and 5' nuclease activity, described below, in the
presence of a range of Mg++ concentrations (5 to 10 mM). The effect of a given
concentration of Mg++ is determined by quantitation of the amount of synthesis and
cleavage in the test reaction as compared to the standard reaction for each assay.
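
The comparison of test and standard reactions described above amounts to computing relative activities across the titration. The following is a sketch under stated assumptions: the signal values are made up (arbitrary units, not data from the examples), and the cut-offs used to call a condition selective are hypothetical.

```python
def relative_activity(test_signal: float, standard_signal: float) -> float:
    """Signal in the test reaction as a fraction of the standard reaction."""
    if standard_signal <= 0:
        raise ValueError("standard reaction must give a positive signal")
    return test_signal / standard_signal


def selective_conditions(readings: dict[float, tuple[float, float]],
                         standard: tuple[float, float],
                         max_synthesis: float = 0.1,
                         min_cleavage: float = 0.5) -> list[float]:
    """Mg++ concentrations (mM) where synthesis is suppressed but cleavage persists.

    readings maps Mg++ concentration -> (synthesis signal, cleavage signal).
    Both cut-offs are assumptions chosen only for illustration.
    """
    std_synthesis, std_cleavage = standard
    keep = []
    for mg_mM, (synthesis, cleavage) in sorted(readings.items()):
        if (relative_activity(synthesis, std_synthesis) <= max_synthesis
                and relative_activity(cleavage, std_cleavage) >= min_cleavage):
            keep.append(mg_mM)
    return keep


# Made-up signals for a 5 to 10 mM titration.
standard = (1000.0, 800.0)
readings = {5.0: (400.0, 780.0), 7.5: (90.0, 700.0), 10.0: (20.0, 300.0)}
selected = selective_conditions(readings, standard)
```

With these invented numbers only the middle concentration passes both cut-offs; real titrations would of course need replicate measurements before any condition is chosen.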

The inhibitory effects of other ions, polyamines, denaturants such as urea,
formamide, dimethylsulfoxide, glycerol and non-ionic detergents (Triton X-100 and
Tween-20), and nucleic acid binding chemicals such as actinomycin D, ethidium
bromide and psoralens, are tested by their addition to the standard reaction buffers
for the synthesis and 5' nuclease assays. Those compounds having a preferential
inhibitory effect on the synthetic activity of a thermostable polymerase are then
used to create reaction conditions under which 5' nuclease activity (cleavage) is
retained while synthetic activity is reduced or eliminated.

Physical means may be used to preferentially inhibit the synthetic activity of
a polymerase. For example, the synthetic activity of thermostable polymerases is
destroyed by exposure of the polymerase to extreme heat (typically 96 to 100°C)
for extended periods of time (greater than or equal to 20 minutes). While there are
minor differences with respect to the specific heat tolerance for each of the
enzymes, these are readily determined. Polymerases are treated with heat for
various periods of time and the effect of the heat treatment upon the synthetic and
5' nuclease activities is determined.

III. Therapeutic Utility Of 5' Nucleases 

The 5' nucleases of the invention have not only the diagnostic utility
discussed above, but additionally have therapeutic utility for the cleavage and
inactivation of specific mRNAs inside infected cells. The mRNAs of pathogenic
agents, such as viruses and bacteria, are targeted for cleavage by a synthesis-deficient
DNA polymerase by the introduction of an oligonucleotide complementary to a
given mRNA produced by the pathogenic agent into the infected cell along with the
synthesis-deficient polymerase. Any pathogenic agent may be targeted by this
method provided the nucleotide sequence information is available so that an
appropriate oligonucleotide may be synthesized. The synthetic oligonucleotide
anneals to the complementary mRNA, thereby forming a cleavage structure
recognized by the modified enzyme. The ability of the 5' nuclease activity of
thermostable DNA polymerases to cleave RNA-DNA hybrids is shown herein in
Example 1D.

Liposomes provide a convenient delivery system. The synthetic 
oligonucleotide may be conjugated or bound to the nuclease to allow for co- 
delivery of these molecules. Additional delivery systems may be employed. 

Inactivation of pathogenic mRNAs has been described using antisense gene
regulation and using ribozymes (Rossi, U.S. Patent No. 5,144,019, hereby
incorporated by reference). Both of these methodologies have limitations.

The use of antisense RNA to impair gene expression requires stoichiometric
and, therefore, large molar excesses of antisense RNA relative to the pathogenic
RNA to be effective. Ribozyme therapy, on the other hand, is catalytic and
therefore lacks the problem of the need for a large molar excess of the therapeutic
compound found with antisense methods. However, ribozyme cleavage of a given
RNA requires the presence of highly conserved sequences to form the catalytically
active cleavage structure. This requires that the target pathogenic mRNA contain
the conserved sequences (GAAAC(X)nGU), thereby limiting the number of
pathogenic mRNAs that can be cleaved by this method. In contrast, the catalytic
cleavage of RNA by the use of a DNA oligonucleotide and a 5' nuclease is
dependent upon structure only; thus, virtually any pathogenic RNA sequence can be
used to design an appropriate cleavage structure.
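
The contrast drawn above can be sketched in a few lines. The first function scans an mRNA for the conserved sequence named in the text; reading (X)n as a run of 1 to 30 arbitrary ribonucleotides is an assumption, since the permitted values of n are not given here. The second function derives a complementary DNA oligonucleotide for any chosen segment, which is all the structure-dependent approach requires at the sequence level. The target fragments are hypothetical.

```python
import re


def ribozyme_motif_sites(mrna: str, max_spacer: int = 30) -> list[int]:
    """0-based start positions of GAAAC(X)nGU, with (X)n read as 1..max_spacer bases."""
    pattern = re.compile(rf"(?=GAAAC[ACGU]{{1,{max_spacer}}}?GU)")
    return [m.start() for m in pattern.finditer(mrna.upper())]


def antisense_dna_oligo(mrna_segment: str) -> str:
    """DNA oligonucleotide complementary to an RNA segment, written 5' to 3'."""
    pairs = {"A": "T", "U": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(mrna_segment.upper()))


# Hypothetical target fragments, not taken from any pathogen.
with_motif = "AUGGAAACCCGUAA"
without_motif = "AUGCCCUUUAAAGGG"

sites_present = ribozyme_motif_sites(with_motif)
sites_absent = ribozyme_motif_sites(without_motif)
oligo_for_any_target = antisense_dna_oligo(without_motif)
```

A target lacking the conserved sequence yields no candidate ribozyme site, yet a complementary oligonucleotide can still be written for it.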

IV. Detection Of Antigenic Or Nucleic Acid Targets By A Dual
Capture Assay

The ability to generate 5' nucleases from thermostable DNA polymerases 
provides the basis for a novel means of detecting the presence of antigenic or 
nucleic acid targets. In this dual capture assay, the polymerase domains encoding 
the synthetic activity and the nuclease activity are covalently attached to two 
separate and distinct antibodies or oligonucleotides. When both the synthetic and
the nuclease domains are present in the same reaction and dATP, dTTP and a small
amount of poly d(A-T) are provided, an enormous amount of poly d(A-T) is
produced. The large amounts of poly d(A-T) are produced as a result of the ability
of the 5' nuclease to cleave newly made poly d(A-T) to generate primers that are,
in turn, used by the synthetic domain to catalyze the production of even more poly
d(A-T). The 5' nuclease is able to cleave poly d(A-T) because poly d(A-T) is
self-complementary and easily forms alternate structures at elevated temperatures.
These structures are recognized by the 5' nuclease and are then cleaved to generate 
more primer for the synthesis reaction. 

The following is an example of the dual capture assay to detect an 
antigen(s): A sample to be analyzed for a given antigen(s) is provided. This 
sample may comprise a mixture of cells; for example, cells infected with viruses 
display virally-encoded antigens on their surface. If the antigen(s) to be detected 
are present in solution, they are first attached to a solid support such as the wall of 
a microtiter dish or to a bead using conventional methodologies. The sample is 
then mixed with 1) the synthetic domain of a thermostable DNA polymerase 
conjugated to an antibody which recognizes either a first antigen or a first epitope 
on an antigen, and 2) the 5' nuclease domain of a thermostable DNA polymerase 
conjugated to a second antibody which recognizes either a second, distinct antigen 
or a second epitope on the same antigen as recognized by the antibody conjugated 
to the synthetic domain. Following an appropriate period to allow the interaction 
of the antibodies with their cognate antigens (conditions will vary depending upon
the antibodies used; appropriate conditions are well known in the art), the sample is
then washed to remove unbound antibody-enzyme domain complexes. dATP,
dTTP and a small amount of poly d(A-T) are then added to the washed sample and
the sample is incubated at elevated temperatures (generally in the range of 60-80°C
and more preferably, 70-75°C) to permit the thermostable synthetic and 5' nuclease
domains to function. If the sample contains the antigen(s) recognized by both
separately conjugated domains of the polymerase, then an exponential increase in
poly d(A-T) production occurs. If only the antibody conjugated to the synthetic
domain of the polymerase is present in the sample such that no 5' nuclease domain
is present in the washed sample, then only an arithmetic increase in poly d(A-T) is
possible. The reaction conditions may be controlled in such a way so that an
arithmetic increase in poly d(A-T) is below the threshold of detection. This may
be accomplished by controlling the length of time the reaction is allowed to
proceed or by adding so little poly d(A-T) to act as template that, in the absence of
nuclease activity to generate new poly d(A-T) primers, very little poly d(A-T) is
synthesized.
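
The difference between exponential and arithmetic accumulation can be made concrete with a toy model. This is a deliberately simplified sketch, not a description of the actual kinetics: it assumes the reaction proceeds in discrete rounds, that each primer yields one unit of product per round, and that cleavage doubles the number of usable primers each round. The detection threshold is likewise hypothetical.

```python
def poly_dat_units(rounds: int, seed_primers: int, nuclease_present: bool) -> int:
    """Toy count of product units after a number of synthesis rounds."""
    primers = seed_primers
    product = 0
    for _ in range(rounds):
        product += primers       # each primer supports one unit of synthesis
        if nuclease_present:
            primers *= 2         # assumed: cleavage doubles the primer pool
    return product


DETECTION_THRESHOLD = 100        # hypothetical units

both_domains = poly_dat_units(rounds=10, seed_primers=1, nuclease_present=True)
synthesis_only = poly_dat_units(rounds=10, seed_primers=1, nuclease_present=False)

positive_call = both_domains >= DETECTION_THRESHOLD
background_call = synthesis_only >= DETECTION_THRESHOLD
```

In this model, limiting either the number of rounds (reaction time) or the seed amount keeps the synthesis-only signal under the threshold, mirroring the two controls named in the text.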

It is not necessary for both domains of the enzyme to be conjugated to an 
antibody. One can provide the synthetic domain conjugated to an antibody and 
provide the 5' nuclease domain in solution or vice versa. In such a case the
conjugated antibody-enzyme domain is added to the sample, incubated, then
washed. dATP, dTTP, poly d(A-T) and the remaining enzyme domain in solution
are then added.

Additionally, the two enzyme domains may be conjugated to
oligonucleotides such that target nucleic acid sequences can be detected. The
oligonucleotides conjugated to the two different enzyme domains may recognize
different regions on the same target nucleic acid strand or may recognize two 
unrelated target nucleic acids. 

The production of poly d(A-T) may be detected in many ways including:
1) use of a radioactive label on either the dATP or dTTP supplied for the synthesis
of the poly d(A-T), followed by size separation of the reaction products and
autoradiography; 2) use of a fluorescent probe on the dATP and a biotinylated
probe on the dTTP supplied for the synthesis of the poly d(A-T), followed by
passage of the reaction products over an avidin bead, such as magnetic beads
conjugated to avidin; the presence of the fluorescent probe on the avidin-containing
bead indicates that poly d(A-T) has been formed as the fluorescent probe will stick
to the avidin bead only if the fluoresceinated dATP is incorporated into a covalent
linkage with the biotinylated dTTP; and 3) changes in fluorescence polarization
indicating an increase in size. Other means of detecting the presence of poly
d(A-T) include the use of intercalating fluorescence indicators to monitor the increase in
duplex DNA formation.

The advantages of the above dual capture assay for detecting antigenic or 
nucleic acid targets include: 

1) No thermocycling of the sample is required. The polymerase 
domains and the dATP and dTTP are incubated at a fixed temperature (generally 
about 70°C). After 30 minutes of incubation up to 75% of the added dNTPs are 
incorporated into poly d(A-T). The lack of thermocycling makes this assay well 
suited to clinical laboratory settings; there is no need to purchase a thermocycling 
apparatus and there is no need to maintain very precise temperature control. 

2) The reaction conditions are simple. The incubation of the bound
enzymatic domains is done in a buffer containing 0.5 mM MgCl2 (higher
concentrations may be used), 2-10 mM Tris-Cl, pH 8.5, approximately 50 µM
dATP and dTTP. The reaction volume is 10-20 µl and reaction products are
detectable within 10-20 minutes.

3) No reaction is detected unless both the synthetic and nuclease 
activities are present. Thus, a positive result indicates that both probes (antibody or 
oligonucleotide) have recognized their targets thereby increasing the specificity of 
recognition by having two different probes bind to the target. 
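
The quantities implied by the conditions listed above follow from simple unit arithmetic (µmol/L multiplied by µl gives pmol). The sketch below assumes that "approximately 50 µM" applies to each of dATP and dTTP rather than to their sum, which the text does not state explicitly.

```python
def picomoles(concentration_uM: float, volume_ul: float) -> float:
    """Amount in pmol: (umol/L) x (uL) = pmol."""
    return concentration_uM * volume_ul


DNTP_UM = 50.0           # assumed to apply to each of dATP and dTTP
MAX_INCORPORATED = 0.75  # "up to 75%" after 30 minutes, from item 1 above

summary = {}
for volume_ul in (10.0, 20.0):
    each = picomoles(DNTP_UM, volume_ul)
    total = 2 * each
    summary[volume_ul] = (each, MAX_INCORPORATED * total)
```

Under that assumption a 10 µl reaction holds 500 pmol of each nucleotide, so at most about 750 pmol of nucleotide would end up in poly d(A-T).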

The ability to separate the two enzymatic activities of the DNAP allows for
exponential increases in poly d(A-T) production. If a DNAP is used which lacks
5' nuclease activity, such as the Klenow fragment of DNAPEcl, only a linear or
arithmetic increase in poly d(A-T) production is possible [Setlow et al., J. Biol.
Chem. 247:224 (1972)]. The ability to provide an enzyme having 5' nuclease
activity but lacking synthetic activity is made possible by the disclosure of this
invention.

EXPERIMENTAL 

The following examples serve to illustrate certain preferred embodiments 
and aspects of the present invention and are not to be construed as limiting the 
scope thereof.

In the disclosure which follows, the following abbreviations apply: °C
(degrees Centigrade); g (gravitational field); vol (volume); w/v (weight to volume);
v/v (volume to volume); BSA (bovine serum albumin); CTAB
(cetyltrimethylammonium bromide); HPLC (high pressure liquid chromatography);
DNA (deoxyribonucleic acid); p (plasmid); µl (microliters); ml (milliliters); µg
(micrograms); pmoles (picomoles); mg (milligrams); M (molar); mM (milliMolar);
µM (microMolar); nm (nanometers); kdal (kilodaltons); OD (optical density);
EDTA (ethylene diamine tetra-acetic acid); FITC (fluorescein isothiocyanate); SDS
(sodium dodecyl sulfate); NaPO4 (sodium phosphate); Tris
(tris(hydroxymethyl)-aminomethane); PMSF (phenylmethylsulfonylfluoride); TBE
(Tris-Borate-EDTA, i.e., Tris buffer titrated with boric acid rather than HCl and
containing EDTA); PBS (phosphate buffered saline); PPBS (phosphate buffered
saline containing 1 mM PMSF); PAGE (polyacrylamide gel electrophoresis); Tween
(polyoxyethylene-sorbitan); Dynal (Dynal A.S., Oslo, Norway); Epicentre (Epicentre Technologies,
Madison, WI); National Biosciences (Plymouth, MN); New England Biolabs 
(Beverly, MA); Novagen (Novagen, Inc., Madison, WI); Perkin Elmer (Norwalk, 
CT); Promega Corp. (Madison, WI); Stratagene (Stratagene Cloning Systems, La 
Jolla, CA); USB (U.S. Biochemical, Cleveland, OH). 

EXAMPLE 1

Characteristics Of Native Thermostable DNA Polymerases
A. 5' Nuclease Activity Of DNAPTaq

During the polymerase chain reaction (PCR) [Saiki et al., Science 239:487
(1988); Mullis and Faloona, Methods in Enzymology 155:335 (1987)], DNAPTaq is
able to amplify many, but not all, DNA sequences. One sequence that cannot be
amplified using DNAPTaq is shown in Figure 6 (Hairpin structure is SEQ ID
NO:15, PRIMERS are SEQ ID NOS:16-17.) This DNA sequence has the
distinguishing characteristic of being able to fold on itself to form a hairpin with
two single-stranded arms, which correspond to the primers used in PCR.

To test whether this failure to amplify is due to the 5' nuclease activity of
the enzyme, we compared the abilities of DNAPTaq and DNAPStf to amplify this
DNA sequence during 30 cycles of PCR. Synthetic oligonucleotides were obtained
from The Biotechnology Center at the University of Wisconsin-Madison. The
DNAPTaq and DNAPStf were from Perkin Elmer (i.e., Amplitaq™ DNA
polymerase and the Stoffel fragment of Amplitaq™ DNA polymerase). The
substrate DNA comprised the hairpin structure shown in Figure 6 cloned in a
double-stranded form into pUC19. The primers used in the amplification are listed
as SEQ ID NOS:16-17. Primer SEQ ID NO:17 is shown annealed to the 3' arm of
the hairpin structure in Fig. 6. Primer SEQ ID NO:16 is shown as the first 20
nucleotides in bold on the 5' arm of the hairpin in Fig. 6.

Polymerase chain reactions comprised 1 ng of supercoiled plasmid target
DNA, 5 pmoles of each primer, 40 µM each dNTP, and 2.5 units of DNAPTaq or
DNAPStf, in a 50 µl solution of 10 mM Tris-Cl pH 8.3. The DNAPTaq reactions
included 50 mM KCl and 1.5 mM MgCl2. The temperature profile was 95°C for
30 sec, 55°C for 1 min. and 72°C for 1 min., through 30 cycles. Ten percent of
each reaction was analyzed by gel electrophoresis through 6% polyacrylamide
(cross-linked 29:1) in a buffer of 45 mM Tris-Borate, pH 8.3, 1.4 mM EDTA.

The results are shown in Figure 7. The expected product was made by
DNAPStf (indicated simply as "S") but not by DNAPTaq (indicated as "T"). We
conclude that the 5' nuclease activity of DNAPTaq is responsible for the lack of
amplification of this DNA sequence.

To test whether the 5' unpaired nucleotides in the substrate region of this
structured DNA are removed by DNAPTaq, the fate of the end-labeled 5' arm
during four cycles of PCR was compared using the same two polymerases (Figure
8). The hairpin templates, such as the one described in Figure 6, were made using
DNAPStf and a 32P-5'-end-labeled primer. The 5'-end of the DNA was released as
a few large fragments by DNAPTaq but not by DNAPStf. The sizes of these
fragments (based on their mobilities) show that they contain most or all of the
unpaired 5' arm of the DNA. Thus, cleavage occurs at or near the base of the
bifurcated duplex. These released fragments terminate with 3' OH groups, as
evidenced by direct sequence analysis, and the abilities of the fragments to be
extended by terminal deoxynucleotidyl transferase.

Figures 9-11 show the results of experiments designed to characterize the
cleavage reaction catalyzed by DNAPTaq. Unless otherwise specified, the cleavage
reactions comprised 0.01 pmoles of heat-denatured, end-labeled hairpin DNA (with
the unlabeled complementary strand also present), 1 pmole primer (complementary
to the 3' arm) and 0.5 units of DNAPTaq (estimated to be 0.026 pmoles) in a total
volume of 10 µl of 10 mM Tris-Cl, pH 8.5, 50 mM KCl and 1.5 mM MgCl2. As
indicated, some reactions had different concentrations of KCl, and the precise times
and temperatures used in each experiment are indicated in the individual figures.
The reactions that included a primer used the one shown in Figure 6 (SEQ ID
NO:17). In some instances, the primer was extended to the junction site by
providing polymerase and selected nucleotides.
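
The amounts given for the standard cleavage reaction translate into concentrations and molar ratios as follows. This is simple unit arithmetic on the stated values (pmol per µl equals µM); the variable names are illustrative only.

```python
def nanomolar(pmol: float, volume_ul: float) -> float:
    """Concentration in nM: pmol/uL equals uM, so multiply by 1000."""
    return pmol * 1000.0 / volume_ul


VOLUME_UL = 10.0
components_pmol = {"hairpin": 0.01, "primer": 1.0, "DNAPTaq": 0.026}

# Concentration of each component in the 10 ul reaction.
conc_nM = {name: nanomolar(p, VOLUME_UL) for name, p in components_pmol.items()}

# Molar ratios relative to the labeled hairpin substrate.
primer_per_hairpin = components_pmol["primer"] / components_pmol["hairpin"]
enzyme_per_hairpin = components_pmol["DNAPTaq"] / components_pmol["hairpin"]
```

The primer is therefore present at roughly a hundredfold molar excess over the hairpin, and the estimated enzyme amount is also in modest excess over substrate in the standard reaction.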

Reactions were initiated at the final reaction temperature by the addition of
either the MgCl2 or enzyme. Reactions were stopped at their incubation
temperatures by the addition of 8 µl of 95% formamide with 20 mM EDTA and
0.05% marker dyes. The Tm calculations listed were made using the Oligo™
primer analysis software from National Biosciences, Inc. These were determined
using 0.25 µM as the DNA concentration, at either 15 or 65 mM total salt (the 1.5
mM MgCl2 in all reactions was given the value of 15 mM salt for these
calculations).
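
The two total-salt values used for the Tm calculations can be reproduced from the reaction compositions. In this sketch the tenfold weighting of MgCl2 is inferred from the stated equivalence (1.5 mM MgCl2 counted as 15 mM salt); it is not presented as a general rule, and the function name is hypothetical.

```python
MG_TO_SALT_FACTOR = 10.0  # inferred from "1.5 mM MgCl2 ... given the value of 15 mM salt"


def total_salt_mM(kcl_mM: float, mgcl2_mM: float = 1.5) -> float:
    """Total salt value used for Tm estimates, counting MgCl2 at the inferred weight."""
    return kcl_mM + MG_TO_SALT_FACTOR * mgcl2_mM


salt_without_kcl = total_salt_mM(0.0)   # reactions run in the absence of KCl
salt_with_kcl = total_salt_mM(50.0)     # standard reactions with 50 mM KCl
```

These correspond to the 15 mM and 65 mM total salt figures quoted above for reactions without KCl and with 50 mM KCl, respectively.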
Figure 9 is an autoradiogram containing the results of a set of experiments
and conditions on the cleavage site. Figure 9A is a determination of reaction
components that enable cleavage. Incubation of 5'-end-labeled hairpin DNA was
for 30 minutes at 55°C, with the indicated components. The products were
resolved by denaturing polyacrylamide gel electrophoresis and the lengths of the
products, in nucleotides, are indicated. Figure 9B describes the effect of
temperature on the site of cleavage in the absence of added primer. Reactions were
incubated in the absence of KCl for 10 minutes at the indicated temperatures. The
lengths of the products, in nucleotides, are indicated.

Surprisingly, cleavage by DNAPTaq requires neither a primer nor dNTPs
(see Fig. 9A). Thus, the 5' nuclease activity can be uncoupled from
polymerization. Nuclease activity requires magnesium ions, though manganese ions
can be substituted, albeit with potential changes in specificity and activity. Neither
zinc nor calcium ions support the cleavage reaction. The reaction occurs over a
broad temperature range, from 25°C to 85°C, with the rate of cleavage increasing
at higher temperatures.

Still referring to Figure 9, the primer is not elongated in the absence of 
added dNTPs. However, the primer influences both the site and the rate of 
cleavage of the hairpin. The change in the site of cleavage (Fig. 9A) apparently
results from disruption of a short duplex formed between the arms of the DNA
substrate. In the absence of primer, the sequences indicated by underlining in
Figure 6 could pair, forming an extended duplex. Cleavage at the end of the
extended duplex would release the 11 nucleotide fragment seen on the Fig. 9A
lanes with no added primer. Addition of excess primer (Fig. 9A, lanes 3 and 4) or
incubation at an elevated temperature (Fig. 9B) disrupts the short extension of the
duplex and results in a longer 5' arm and, hence, longer cleavage products.

The location of the 3' end of the primer can influence the precise site of
cleavage. Electrophoretic analysis revealed that in the absence of primer (Fig. 9B),
cleavage occurs at the end of the substrate duplex (either the extended or shortened
form, depending on the temperature) between the first and second base pairs.
When the primer extends up to the base of the duplex, cleavage also occurs one
nucleotide into the duplex. However, when a gap of four or six nucleotides exists
between the 3' end of the primer and the substrate duplex, the cleavage site is
shifted four to six nucleotides in the 5' direction.

Fig. 10 describes the kinetics of cleavage in the presence (Fig. 10A) or 
absence (Fig. 10B) of a primer oligonucleotide. The reactions were run at 55°C 
with either 50 mM KCl (Fig. 10A) or 20 mM KCl (Fig. 10B). The reaction
products were resolved by denaturing polyacrylamide gel electrophoresis and the
lengths of the products, in nucleotides, are indicated. "M", indicating a marker, is
a 5' end-labeled 19-nt oligonucleotide. Under these salt conditions, Figs. 10A and
10B indicate that the reaction appears to be about twenty times faster in the
presence of primer than in the absence of primer. This effect on the efficiency
may be attributable to proper alignment and stabilization of the enzyme on the
substrate. 

The relative influence of primer on cleavage rates becomes much greater
when both reactions are run in 50 mM KCl. In the presence of primer, the rate of
cleavage increases with KCl concentration, up to about 50 mM. However,
inhibition of this reaction in the presence of primer is apparent at 100 mM and is
complete at 150 mM KCl. In contrast, in the absence of primer the rate is
enhanced by concentration of KCl up to 20 mM, but it is reduced at concentrations
above 30 mM. At 50 mM KCl, the reaction is almost completely inhibited. The
inhibition of cleavage by KCl in the absence of primer is affected by temperature,
being more pronounced at lower temperatures.

Recognition of the 5' end of the arm to be cut appears to be an important
feature of substrate recognition. Substrates that lack a free 5' end, such as circular
M13 DNA, cannot be cleaved under any conditions tested. Even with substrates
having defined 5' arms, the rate of cleavage by DNAPTaq is influenced by the
length of the arm. In the presence of primer and 50 mM KCl, cleavage of a 5'
extension that is 27 nucleotides long is essentially complete within 2 minutes at
55°C. In contrast, cleavages of molecules with 5' arms of 84 and 188 nucleotides
are only about 90% and 40% complete after 20 minutes. Incubation at higher
temperatures reduces the inhibitory effects of long extensions, indicating that
secondary structure in the 5' arm or a heat-labile structure in the enzyme may
inhibit the reaction. A mixing experiment, run under conditions of substrate
excess, shows that the molecules with long arms do not preferentially tie up the
available enzyme in non-productive complexes. These results may indicate that the
5' nuclease domain gains access to the cleavage site at the end of the bifurcated
duplex by moving down the 5' arm from one end to the other. Longer 5' arms
would be expected to have more adventitious secondary structures (particularly
when KCl concentrations are high), which would be likely to impede this
movement.
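
As a rough illustration of how strongly arm length affects the reaction, the
completion figures quoted above can be converted into apparent rate constants.
The sketch below assumes simple first-order kinetics, which the text does not
assert; the function name is hypothetical, and the results are only as precise
as the "about 90%" and "about 40%" estimates they are derived from.

```python
import math


def apparent_rate_constant(fraction_cleaved, minutes):
    """Apparent first-order rate constant (per minute).

    Assumes fraction_cleaved = 1 - exp(-k * minutes). The first-order
    model is an illustrative assumption, not a claim made in the text.
    """
    if not 0.0 < fraction_cleaved < 1.0:
        raise ValueError("fraction_cleaved must be strictly between 0 and 1")
    return -math.log(1.0 - fraction_cleaved) / minutes


# Values quoted in the text: 5' arms of 84 and 188 nucleotides are about
# 90% and 40% cleaved after 20 minutes.
k_84 = apparent_rate_constant(0.90, 20.0)
k_188 = apparent_rate_constant(0.40, 20.0)

print(f"84-nt arm:  k ~ {k_84:.3f} per min")
print(f"188-nt arm: k ~ {k_188:.3f} per min")
print(f"ratio k_84 / k_188 ~ {k_84 / k_188:.1f}")
```

Under this assumption the 84-nucleotide arm is cleaved roughly four to five
times faster than the 188-nucleotide arm. The 27-nucleotide arm is omitted
because "essentially complete" does not give a usable fraction.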

Cleavage does not appear to be inhibited by long 3' arms of either the
substrate strand target molecule or pilot nucleic acid, at least up to 2 kilobases. At
the other extreme, 3' arms of the pilot nucleic acid as short as one nucleotide can
support cleavage in a primer-independent reaction, albeit inefficiently. Fully paired
oligonucleotides do not elicit cleavage of DNA templates during primer extension.

The ability of DNAPTaq to cleave molecules even when the complementary
strand contains only one unpaired 3' nucleotide may be useful in optimizing
allele-specific PCR. PCR primers that have unpaired 3' ends could act as pilot
oligonucleotides to direct selective cleavage of unwanted templates during
preincubation of potential template-primer complexes with DNAPTaq in the
absence of nucleoside triphosphates.

B. 5' Nuclease Activities Of Other DNAPs 

To determine whether other 5' nucleases in other DNAPs would be suitable
for the present invention, an array of enzymes, several of which were reported in
the literature to be free of apparent 5' nuclease activity, were examined. The
ability of these other enzymes to cleave nucleic acids in a structure-specific manner
was tested using the hairpin substrate shown in Fig. 6 under conditions reported to
be optimal for synthesis by each enzyme.

DNAPEc1 and DNAP Klenow were obtained from Promega Corporation;
the DNAP of Pyrococcus furiosus ["Pfu", Bargseid et al., Strategies 4:34 (1991)]
was from Stratagene; the DNAP of Thermococcus litoralis ["Tli", Vent™(exo-),
Perler et al., Proc. Natl. Acad. Sci. USA 89:5577 (1992)] was from New England
Biolabs; the DNAP of Thermus flavus ["Tfl", Kaledin et al., Biokhimiya 46:1576
(1981)] was from Epicentre Technologies; and the DNAP of Thermus thermophilus
["Tth", Carballeira et al., Biotechniques 9:276 (1990); Myers et al., Biochem.
30:7661 (1991)] was from U.S. Biochemicals.

0.5 units of each DNA polymerase was assayed in a 20 µl reaction, using
either the buffers supplied by the manufacturers for the primer-dependent reactions,
or 10 mM Tris·Cl, pH 8.5, 1.5 mM MgCl2, and 20 mM KCl. Reaction mixtures
were held at 72°C before the addition of enzyme.

Fig. 11 is an autoradiogram recording the results of these tests. Fig. 11A
demonstrates reactions of endonucleases of DNAPs of several thermophilic bacteria.
The reactions were incubated at 55°C for 10 minutes in the presence of primer or
at 72°C for 30 minutes in the absence of primer, and the products were resolved by
denaturing polyacrylamide gel electrophoresis. The lengths of the products, in
nucleotides, are indicated. Fig. 11B demonstrates endonucleolytic cleavage by the
5' nuclease of DNAPEc1. The DNAPEc1 and DNAP Klenow reactions were
incubated for 5 minutes at 37°C. Note the light band of cleavage products of 25
and 11 nucleotides in the DNAPEc1 lanes (made in the presence and absence of
primer, respectively). Fig. 7B also demonstrates DNAPTaq reactions in the
presence (+) or absence (-) of primer. These reactions were run in 50 mM and 20
mM KCl, respectively, and were incubated at 55°C for 10 minutes.

Referring to Fig. 11A, DNAPs from the eubacteria Thermus thermophilus
and Thermus flavus cleave the substrate at the same place as DNAPTaq, both in the
presence and absence of primer. In contrast, DNAPs from the archaebacteria
Pyrococcus furiosus and Thermococcus litoralis are unable to cleave the substrates
endonucleolytically. The DNAPs from Pyrococcus furiosus and Thermococcus
litoralis share little sequence homology with eubacterial enzymes (Ito et al., Nucl.
Acids Res. 19:4045 (1991); Mathur et al., Nucl. Acids Res. 19:6952 (1991); see
also Perler et al.). Referring to Fig. 11B, DNAPEc1 also cleaves the substrate, but
the resulting cleavage products are difficult to detect unless the 3' exonuclease is
inhibited. The amino acid sequences of the 5' nuclease domains of DNAPEc1 and
DNAPTaq are about 38% homologous (Gelfand, supra).

The 5' nuclease domain of DNAPTaq also shares about 19% homology with
the 5' exonuclease encoded by gene 6 of bacteriophage T7 [Dunn et al., J. Mol.
Biol. 166:477 (1983)]. This nuclease, which is not covalently attached to a DNAP
polymerization domain, is also able to cleave DNA endonucleolytically, at a site
similar or identical to the site that is cut by the 5' nucleases described above, in the
absence of added primers.

C. Transcleavage

The ability of a 5' nuclease to be directed to cleave efficiently at any
specific sequence was demonstrated in the following experiment. A partially
complementary oligonucleotide termed a "pilot oligonucleotide" was hybridized to
sequences at the desired point of cleavage. The non-complementary part of the
pilot oligonucleotide provided a structure analogous to the 3' arm of the template
(see Fig. 6), whereas the 5' region of the substrate strand became the 5' arm. A
primer was provided by designing the 3' region of the pilot so that it would fold on
itself creating a short hairpin with a stabilizing tetra-loop [Antao et al., Nucl. Acids
Res. 19:5901 (1991)]. Two pilot oligonucleotides are shown in Fig. 12A.
Oligonucleotides 19-12 (SEQ ID NO:18), 30-12 (SEQ ID NO:19) and 30-0 (SEQ
ID NO:20) are 31, 42 or 30 nucleotides long, respectively. However,
oligonucleotides 19-12 (SEQ ID NO:18) and 30-12 (SEQ ID NO:19) have only 19
and 30 nucleotides, respectively, that are complementary to different sequences in
the substrate strand. The pilot oligonucleotides are calculated to melt off their
complements at about 50°C (19-12) and about 75°C (30-12). Both pilots have 12
nucleotides at their 3' ends, which act as 3' arms with base-paired primers attached.
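
The pilot names appear to encode the two segment lengths described above. The
following sketch, which assumes the reading "complementary nucleotides, then 3'
arm nucleotides" (an interpretation, with a hypothetical helper name), checks
that this reading reproduces the stated total lengths.

```python
def pilot_lengths(name):
    """Split a pilot name such as '19-12' into (complementary, arm, total).

    Assumes the name is '<complementary nt>-<3' arm nt>'; this is an
    interpretation of the naming used in the text.
    """
    complementary, arm = (int(part) for part in name.split("-"))
    return complementary, arm, complementary + arm


# Totals stated in the text for each oligonucleotide.
stated_totals = {"19-12": 31, "30-12": 42, "30-0": 30}

for name, stated in stated_totals.items():
    complementary, arm, total = pilot_lengths(name)
    print(f"{name}: {complementary} complementary + {arm} arm = {total} "
          f"(stated {stated})")
```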

To demonstrate that cleavage could be directed by a pilot oligonucleotide,
we incubated a single-stranded target DNA with DNAPTaq in the presence of two
potential pilot oligonucleotides. The transcleavage reactions, where the target and
pilot nucleic acids are not covalently linked, include 0.01 pmoles of single
end-labeled substrate DNA, 1 unit of DNAPTaq and 5 pmoles of pilot oligonucleotide
in a volume of 20 µl of the same buffers. These components were combined
during a one minute incubation at 95°C, to denature the PCR-generated
double-stranded substrate DNA, and the temperatures of the reactions were then
reduced to their final incubation temperatures. Oligonucleotides 30-12 and 19-12 can
hybridize to regions of the substrate DNAs that are 85 and 27 nucleotides from the
5' end of the targeted strand.

Figure 21 shows the complete 206-mer sequence (SEQ ID NO:32). The
206-mer was generated by PCR. The M13/pUC 24-mer reverse sequencing (-48)
primer and the M13/pUC sequencing (-47) primer from New England Biolabs
(catalogue nos. 1233 and 1224 respectively) were used (50 pmoles each) with the
pGEM3z(f+) plasmid vector (Promega Corp.) as template (10 ng) containing the
target sequences. The conditions for PCR were as follows: 50 µM of each dNTP
and 2.5 units of Taq DNA polymerase in 100 µl of 20 mM Tris-Cl, pH 8.3, 1.5
mM MgCl2, 50 mM KCl with 0.05% Tween-20 and 0.05% NP-40. Reactions were
cycled 35 times through 95°C for 45 seconds, 63°C for 45 seconds, then 72°C for
75 seconds. After cycling, reactions were finished off with an incubation at 72°C
for 5 minutes. The resulting fragment was purified by electrophoresis through a
6% polyacrylamide gel (29:1 cross link) in a buffer of 45 mM Tris-Borate, pH 8.3,
1.4 mM EDTA, visualized by ethidium bromide staining or autoradiography,
excised from the gel, eluted by passive diffusion, and concentrated by ethanol
precipitation.
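
The quantities given above for the transcleavage reaction and for the PCR that
generated the 206-mer can be checked with a few lines of arithmetic. The sketch
below uses only the stated amounts; the variable and function names are
hypothetical, and the PCR time counts programmed holds only, not the ramping
time of any particular thermal cycler.

```python
# Transcleavage reaction (amounts from the text).
substrate_pmol = 0.01
pilot_pmol = 5.0
reaction_volume_ul = 20.0


def nanomolar(pmol, volume_ul):
    """Concentration in nM; 1 pmol per microliter is 1 micromolar."""
    return pmol / volume_ul * 1000.0


pilot_excess = pilot_pmol / substrate_pmol
substrate_nM = nanomolar(substrate_pmol, reaction_volume_ul)
pilot_nM = nanomolar(pilot_pmol, reaction_volume_ul)

# PCR programme: 35 cycles of 45 s, 45 s and 75 s, then 5 minutes at 72 C.
cycles = 35
step_seconds = (45, 45, 75)
final_extension_seconds = 5 * 60
hold_seconds = cycles * sum(step_seconds) + final_extension_seconds

print(f"pilot is in {pilot_excess:.0f}-fold molar excess over substrate")
print(f"substrate {substrate_nM:.1f} nM, pilot {pilot_nM:.0f} nM")
print(f"programmed PCR hold time: {hold_seconds / 60:.2f} minutes")
```

The large molar excess of pilot over labeled substrate is consistent with the
pilot, rather than the substrate, being the component whose hybridization
drives the reaction.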

Cleavage of the substrate DNA occurred in the presence of the pilot
oligonucleotide 19-12 at 50°C (Fig. 12B, lanes 1 and 7) but not at 75°C (lanes 4
and 10). In the presence of oligonucleotide 30-12 cleavage was observed at both
temperatures. Cleavage did not occur in the absence of added oligonucleotides
(lanes 3, 6 and 12) or at about 80°C even though at 50°C adventitious structures in
the substrate allowed primer-independent cleavage in the absence of KCl (Fig. 12B,
lane 9). A non-specific oligonucleotide with no complementarity to the substrate
DNA did not direct cleavage at 50°C, either in the absence or presence of 50 mM
KCl (lanes 13 and 14). Thus, the specificity of the cleavage reactions can be
controlled by the extent of complementarity to the substrate and by the conditions
of incubation.

D. Cleavage Of RNA

A shortened RNA version of the sequence used in the transcleavage
experiments discussed above was tested for its ability to serve as a substrate in the
reaction. The RNA is cleaved at the expected place, in a reaction that is dependent
upon the presence of the pilot oligonucleotide. The RNA substrate, made by T7
RNA polymerase in the presence of [α-32P]UTP, corresponds to a truncated version
of the DNA substrate used in Figure 12B. Reaction conditions were similar to
those used for the DNA substrates described above, with 50 mM KCl; incubation
was for 40 minutes at 55°C. The pilot oligonucleotide used is termed 30-0 (SEQ
ID NO:20) and is shown in Fig. 13A.

The results of the cleavage reaction are shown in Figure 13B. The reaction
was run either in the presence or absence of DNAPTaq or pilot oligonucleotide as
indicated in Figure 13B.

Strikingly, in the case of RNA cleavage, a 3' arm is not required for the
pilot oligonucleotide. It is very unlikely that this cleavage is due to previously
described RNaseH, which would be expected to cut the RNA in several places
along the 30 base-pair long RNA-DNA duplex. The 5' nuclease of DNAPTaq is a
structure-specific RNaseH that cleaves the RNA at a single site near the 5' end of
the heteroduplexed region.

It is surprising that an oligonucleotide lacking a 3' arm is able to act as a
pilot in directing efficient cleavage of an RNA target because such oligonucleotides
are unable to direct efficient cleavage of DNA targets using native DNAPs.
However, some 5' nucleases of the present invention (for example, clones E, F and
G of Figure 4) can cleave DNA in the absence of a 3' arm. In other words, a
non-extendable cleavage structure is not required for specific cleavage with some 5'
nucleases of the present invention derived from thermostable DNA polymerases.

We tested whether cleavage of an RNA template by DNAPTaq in the
presence of a fully complementary primer could help explain why DNAPTaq is
unable to extend a DNA oligonucleotide on an RNA template, in a reaction
resembling that of reverse transcriptase. Another thermophilic DNAP, DNAPTth,
is able to use RNA as a template, but only in the presence of Mn++, so we
predicted that this enzyme would not cleave RNA in the presence of this cation.
Accordingly, we incubated an RNA molecule with an appropriate pilot
oligonucleotide in the presence of DNAPTaq or DNAPTth, in buffer containing
either Mg++ or Mn++. As expected, both enzymes cleaved the RNA in the
presence of Mg++. However, DNAPTaq, but not DNAPTth, degraded the RNA in
the presence of Mn++. We conclude that the 5' nuclease activities of many
DNAPs may contribute to their inability to use RNA as templates.

EXAMPLE 2 

Generation Of 5' Nucleases From Thermostable DNA Polymerases 

Thermostable DNA polymerases were generated which have reduced
synthetic activity, an activity that is an undesirable side-reaction during DNA
cleavage in the detection assay of the invention, yet have maintained thermostable
nuclease activity. The result is a thermostable polymerase which cleaves nucleic
acids with extreme specificity.

Type A DNA polymerases from eubacteria of the genus Thermus share
extensive protein sequence identity (90% in the polymerization domain, using the
Lipman-Pearson method in the DNA analysis software from DNAStar, WI) and
behave similarly in both polymerization and nuclease assays. Therefore, we have
used the genes for the DNA polymerase of Thermus aquaticus (DNAPTaq) and
Thermus flavus (DNAPTfl) as representatives of this class. Polymerase genes from
other eubacterial organisms, such as Thermus thermophilus, Thermus sp.,
Thermotoga maritima, Thermosipho africanus and Bacillus stearothermophilus are
equally suitable. The DNA polymerases from these thermophilic organisms are
capable of surviving and performing at elevated temperatures, and can thus be used
in reactions in which temperature is used as a selection against non-specific
hybridization of nucleic acid strands.

The restriction sites used for deletion mutagenesis, described below, were
chosen for convenience. Different sites situated with similar convenience are
available in the Thermus thermophilus gene and can be used to make similar
constructs with other Type A polymerase genes from related organisms.

A. Creation Of 5' Nuclease Constructs

1. Modified DNAPTaq Genes

The first step was to place a modified gene for the Taq DNA polymerase on
a plasmid under control of an inducible promoter. The modified Taq polymerase
gene was isolated as follows: The Taq DNA polymerase gene was amplified by
polymerase chain reaction from genomic DNA from Thermus aquaticus, strain YT-1
(Lawyer et al., supra), using as primers the oligonucleotides described in SEQ ID
NOS:13-14. The resulting fragment of DNA has a recognition sequence for the
restriction endonuclease EcoRI at the 5' end of the coding sequence and a BglII
sequence at the 3' end. Cleavage with BglII leaves a 5' overhang or "sticky end"
that is compatible with the end generated by BamHI. The PCR-amplified DNA
was digested with EcoRI and BamHI. The 2512 bp fragment containing the coding
region for the polymerase gene was gel purified and then ligated into a plasmid
which contains an inducible promoter.
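
The compatibility of BglII and BamHI ends noted above follows from their
recognition sites. The sketch below derives the overhang each enzyme leaves,
using the standard sites A^GATCT (BglII) and G^GATCC (BamHI); the helper names
are hypothetical, and the overhang calculation assumes a palindromic site cut
symmetrically nearer its 5' end.

```python
# Recognition sites written 5'->3', with '^' marking the top-strand cut.
SITES = {
    "BglII": "A^GATCT",
    "BamHI": "G^GATCC",
}

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def five_prime_overhang(site):
    """Single-stranded 5' overhang left by a symmetric cut in a palindrome."""
    cut = site.index("^")
    seq = site.replace("^", "")
    # The bottom strand is cut at the mirror position, so the bases between
    # the two cut points are left single-stranded.
    return seq[cut:len(seq) - cut]


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (uppercase A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


overhangs = {name: five_prime_overhang(site) for name, site in SITES.items()}
# Two 5' overhangs can anneal when one is the reverse complement of the other.
compatible = overhangs["BglII"] == reverse_complement(overhangs["BamHI"])

print(overhangs)
print("compatible ends:", compatible)
```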

In one embodiment of the invention, the pTTQ18 vector, which contains the
hybrid trp-lac (tac) promoter, was used [M.J.R. Stark, Gene 5:255 (1987)] and is
shown in Fig. 14. The tac promoter is under the control of the E. coli lac
repressor. Repression allows the synthesis of the gene product to be suppressed
until the desired level of bacterial growth has been achieved, at which point
repression is removed by addition of a specific inducer,
isopropyl-β-D-thiogalactopyranoside (IPTG). Such a system allows the expression
of foreign proteins that may slow or prevent growth of transformants.

Bacterial promoters, such as tac, may not be adequately suppressed when
they are present on a multiple copy plasmid. If a highly toxic protein is placed
under control of such a promoter, the small amount of expression leaking through
can be harmful to the bacteria. In another embodiment of the invention, another
option for repressing synthesis of a cloned gene product was used. The
non-bacterial promoter, from bacteriophage T7, found in the plasmid vector series
pET-3 was used to express the cloned mutant Taq polymerase genes [Fig. 15; Studier
and Moffatt, J. Mol. Biol. 189:113 (1986)]. This promoter initiates transcription
only by T7 RNA polymerase. In a suitable strain, such as BL21(DE3)pLYS, the
gene for this RNA polymerase is carried on the bacterial genome under control of
the lac operator. This arrangement has the advantage that expression of the
multiple copy gene (on the plasmid) is completely dependent on the expression of
T7 RNA polymerase, which is easily suppressed because it is present in a single
copy.

For ligation into the pTTQ18 vector (Fig. 14), the PCR product DNA
containing the Taq polymerase coding region (mutTaq, clone 4B, SEQ ID NO:21)
was digested with EcoRI and BglII and this fragment was ligated under standard
"sticky end" conditions [Sambrook et al., Molecular Cloning, Cold Spring Harbor
Laboratory Press, Cold Spring Harbor (1989)] into the EcoRI and BamHI sites of
the plasmid vector pTTQ18. Expression of this construct yields a translational
fusion product in which the first two residues of the native protein (Met-Arg) are
replaced by three from the vector (Met-Asn-Ser), but the remainder of the natural
protein would not change. The construct was transformed into the JM109 strain of
E. coli and the transformants were plated under incompletely repressing conditions
that do not permit growth of bacteria expressing the native protein. These plating
conditions allow the isolation of genes containing pre-existing mutations, such as
those that result from the infidelity of Taq polymerase during the amplification
process.

Using this amplification/selection protocol, we isolated a clone (depicted in
Fig. 4B) containing a mutated Taq polymerase gene (mutTaq, clone 4B). The
mutant was first detected by its phenotype, in which temperature-stable 5' nuclease
activity in a crude cell extract was normal, but polymerization activity was almost
absent (approximately less than 1% of wild type Taq polymerase activity).

DNA sequence analysis of the recombinant gene showed that it had changes
in the polymerase domain resulting in two amino acid substitutions: an A to G
change at nucleotide position 1394 causes a Glu to Gly change at amino acid
position 465 (numbered according to the natural nucleic and amino acid sequences,
SEQ ID NOS:1 and 4) and another A to G change at nucleotide position 2260
causes a Gln to Arg change at amino acid position 754. Because the Glu to Gly
mutation is at a nonconserved position and because the Gln to Arg mutation alters
an amino acid that is conserved in virtually all of the known Type A polymerases,
this latter mutation is most likely the one responsible for curtailing the synthesis
activity of this protein. The nucleotide sequence for the Fig. 4B construct is given
in SEQ ID NO:21.
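
The correspondence between the nucleotide and amino acid positions quoted above
can be checked by integer division. The sketch below assumes that nucleotide 1
is the first base of the first codon, which is not stated in the text but is
consistent with both quoted pairs; the function name is hypothetical.

```python
def codon_number(nucleotide_position):
    """1-based codon index for a 1-based nucleotide position.

    Assumes position 1 is the first base of codon 1 (an assumption about
    the numbering, not a statement from the text).
    """
    if nucleotide_position < 1:
        raise ValueError("positions are 1-based")
    return (nucleotide_position - 1) // 3 + 1


# (nucleotide position, amino acid position) pairs quoted in the text.
quoted = ((1394, 465), (2260, 754))

for nucleotide, residue in quoted:
    computed = codon_number(nucleotide)
    print(f"nucleotide {nucleotide} -> codon {computed} (text: {residue})")
```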

Subsequent derivatives of DNAPTaq constructs were made from the mutTaq
gene; thus, they all bear these amino acid substitutions in addition to their other
alterations, unless these particular regions were deleted. These mutated sites are
indicated by black boxes at these locations in the diagrams in Fig. 4. All
constructs except the genes shown in Figures 4E, F and G were made in the
pTTQ18 vector.

The cloning vector used for the genes in Figs. 4E and F was from the
commercially available pET-3 series, described above. Though this vector series
has only a BamHI site for cloning downstream of the T7 promoter, the series
contains variants that allow cloning into any of the three reading frames. For
cloning of the PCR product described above, the variant called pET-3c was used
(Fig. 15). The vector was digested with BamHI, dephosphorylated with calf
intestinal phosphatase, and the sticky ends were filled in using the Klenow
fragment of DNAPEc1 and dNTPs. The gene for the mutant Taq DNAP shown in
Fig. 4B (mutTaq, clone 4B) was released from pTTQ18 by digestion with EcoRI
and SalI, and the "sticky ends" were filled in as was done with the vector. The
fragment was ligated to the vector under standard blunt-end conditions (Sambrook
et al., Molecular Cloning, supra), the construct was transformed into the
BL21(DE3)pLYS strain of E. coli, and isolates were screened to identify those that
were ligated with the gene in the proper orientation relative to the promoter. This
construction yields another translational fusion product, in which the first two
amino acids of DNAPTaq (Met-Arg) are replaced by 13 from the vector plus two
from the PCR primer
(Met-Ala-Ser-Met-Thr-Gly-Gly-Gln-Gln-Met-Gly-Arg-Ile-Asn-Ser) (SEQ ID NO:29).

Our goal was to generate enzymes that lacked the ability to synthesize
DNA, but retained the ability to cleave nucleic acids with a 5' nuclease activity.
The act of primed, templated synthesis of DNA is actually a coordinated series of
events, so it is possible to disable DNA synthesis by disrupting one event while not
affecting the others. These steps include, but are not limited to, primer recognition
and binding, dNTP binding and catalysis of the inter-nucleotide phosphodiester
bond. Some of the amino acids in the polymerization domain of DNAPEc1 have
been linked to these functions, but the precise mechanisms are as yet poorly
defined.

One way of destroying the polymerizing ability of a DNA polymerase is to
delete all or part of the gene segment that encodes that domain for the protein, or
to otherwise render the gene incapable of making a complete polymerization
domain. Individual mutant enzymes may differ from each other in stability and
solubility both inside and outside cells. For instance, in contrast to the 5' nuclease
domain of DNAPEc1, which can be released in an active form from the
polymerization domain by gentle proteolysis [Setlow and Kornberg, J. Biol. Chem.
247:232 (1972)], the Thermus nuclease domain, when treated similarly, becomes
less soluble and the cleavage activity is often lost.

Using the mutant gene shown in Fig. 4B as starting material, several
deletion constructs were created. All cloning technologies were standard
(Sambrook et al., supra) and are summarized briefly, as follows:

Fig. 4C: The mutTaq construct was digested with PstI, which cuts once
within the polymerase coding region, as indicated, and cuts immediately
downstream of the gene in the multiple cloning site of the vector. After release of
the fragment between these two sites, the vector was re-ligated, creating an
894-nucleotide deletion, and bringing into frame a stop codon 40 nucleotides
downstream of the junction. The nucleotide sequence of this 5' nuclease (clone
4C) is given in SEQ ID NO:9.

Fig. 4D: The mutTaq construct was digested with NheI, which cuts once in
the gene at position 2047. The resulting four-nucleotide 5' overhanging ends were
filled in, as described above, and the blunt ends were re-ligated. The resulting
four-nucleotide insertion changes the reading frame and causes termination of
translation ten amino acids downstream of the mutation. The nucleotide sequence
of this 5' nuclease (clone 4D) is given in SEQ ID NO:10.

Fig. 4E: The entire mutTaq gene was cut from pTTQ18 using EcoRI and
SalI and cloned into pET-3c, as described above. This clone was digested with
BstXI and XcmI, at unique sites that are situated as shown in Fig. 4E. The DNA
was treated with the Klenow fragment of DNAPEc1 and dNTPs, which resulted in
the 3' overhangs of both sites being trimmed to blunt ends. These blunt ends were
ligated together, resulting in an out-of-frame deletion of 1540 nucleotides. An
in-frame termination codon occurs 18 triplets past the junction site. The nucleotide
sequence of this 5' nuclease (clone 4E) is given in SEQ ID NO:11, with the
appropriate leader sequence given in SEQ ID NO:30. It is also referred to as
Cleavase™ BX.

WO 94/29482 PCT/US94/06253

Fig. 4F: The entire mutTaq gene was cut from pTTQ18 using EcoRI and
SalI and cloned into pET-3c, as described above. This clone was digested with
BstXI and BamHI, at unique sites that are situated as shown in the diagram. The
DNA was treated with the Klenow fragment of DNAPEc1 and dNTPs, which
resulted in the 3' overhang of the BstXI site being trimmed to a blunt end, while
the 5' overhang of the BamHI site was filled in to make a blunt end. These ends
were ligated together, resulting in an in-frame deletion of 903 nucleotides. The
nucleotide sequence of the 5' nuclease (clone 4F) is given in SEQ ID NO:12. It is
also referred to as Cleavase™ BB.

Fig. 4G: This polymerase is a variant of that shown in Figure 4E. It was
cloned in the plasmid vector pET-21 (Novagen). The non-bacterial promoter from
bacteriophage T7, found in this vector, initiates transcription only by T7 RNA
polymerase. See Studier and Moffatt, supra. In a suitable strain, such as
(DE3)pLYS, the gene for this RNA polymerase is carried on the bacterial genome
under control of the lac operator. This arrangement has the advantage that
expression of the multiple copy gene (on the plasmid) is completely dependent on
the expression of T7 RNA polymerase, which is easily suppressed because it is
present in a single copy. Because the expression of these mutant genes is under
this tightly controlled promoter, potential problems of toxicity of the expressed
proteins to the host cells are less of a concern.
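
The reading-frame outcomes stated for several of the constructs above follow
from whether the length change is a multiple of three. The sketch below checks
the three cases where the text gives both the length and the outcome; the
function name is hypothetical. The 894-nucleotide deletion of Fig. 4C is left
out because it extends into vector sequence, so its frame depends on the
vector as well as on the length.

```python
def preserves_frame(length_change_nt):
    """True when an insertion (+) or deletion (-) of this many nucleotides
    leaves the downstream reading frame unchanged."""
    return length_change_nt % 3 == 0


# Length changes and outcomes as stated in the text.
constructs = {
    "4D (4-nt insertion)": (4, "frameshift"),
    "4E (1540-nt deletion)": (-1540, "frameshift"),
    "4F (903-nt deletion)": (-903, "in frame"),
}

for label, (delta, stated) in constructs.items():
    computed = "in frame" if preserves_frame(delta) else "frameshift"
    print(f"{label}: computed {computed}, text says {stated}")
```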

The pET-21 vector also features a "His•Tag", a stretch of six consecutive
histidine residues that are added on the carboxy terminus of the expressed proteins.
The resulting proteins can then be purified in a single step by metal chelation
chromatography, using a commercially available (Novagen) column resin with
immobilized Ni++ ions. The 2.5 ml columns are reusable, and can bind up to 20
mg of the target protein under native or denaturing (guanidine·HCl or urea)
conditions.

E. coli (DE3)pLYS cells are transformed with the constructs described
above using standard transformation techniques, and used to inoculate a standard
growth medium (e.g., Luria-Bertani broth). Production of T7 RNA polymerase is
induced during log phase growth by addition of IPTG and incubated for a further
12 to 17 hours. Aliquots of culture are removed both before and after induction
and the proteins are examined by SDS-PAGE. Staining with Coomassie Blue
allows visualization of the foreign proteins if they account for about 3-5% of the
cellular protein and do not co-migrate with any of the major protein bands.
Proteins that co-migrate with major host protein must be expressed as more than
10% of the total protein to be seen at this stage of analysis.

Some mutant proteins are sequestered by the cells into inclusion bodies.
These are granules that form in the cytoplasm when bacteria are made to express
high levels of a foreign protein, and they can be purified from a crude lysate, and
analyzed by SDS-PAGE to determine their protein content. If the cloned protein is
found in the inclusion bodies, it must be released to assay the cleavage and
polymerase activities. Different methods of solubilization may be appropriate for
different proteins, and a variety of methods are known. See e.g. Builder & Ogez,
U.S. Patent No. 4,511,502 (1985); Olson, U.S. Patent No. 4,518,526 (1985); Olson
& Pai, U.S. Patent No. 4,511,503 (1985); Jones et al., U.S. Patent No. 4,512,922
(1985), all of which are hereby incorporated by reference.

The solubilized protein is then purified on the Ni++ column as described
above, following the manufacturer's instructions (Novagen). The washed proteins
are eluted from the column by a combination of imidazole competitor (1 M) and
high salt (0.5 M NaCl), and dialyzed to exchange the buffer and to allow denatured
proteins to refold. Typical recoveries result in approximately 20 µg of specific
protein per ml of starting culture. The DNAP mutant is referred to as Cleavase™
BN and the sequence is given in SEQ ID NO:31.

2. Modified DNAPTfl Gene

The DNA polymerase gene of Thermus flavus was isolated from the "T.
flavus" AT-62 strain obtained from the American Type Culture Collection (ATCC
33923). This strain has a different restriction map than does the T. flavus strain
used to generate the sequence published by Akhmetzjanov and Vakhitov, supra.
The published sequence is listed as SEQ ID NO:2. No sequence data has been
published for the DNA polymerase gene from the AT-62 strain of T. flavus.

Genomic DNA from T. flavus was amplified using the same primers used to
amplify the T. aquaticus DNA polymerase gene (SEQ ID NOS:13-14). The
approximately 2500 base pair PCR fragment was digested with EcoRI and BamHI.
The over-hanging ends were made blunt with the Klenow fragment of DNAPEc1
and dNTPs. The resulting approximately 1800 base pair fragment containing the
coding region for the N-terminus was ligated into pET-3c, as described above.
This construct, clone 5B, is depicted in Fig. 5B. The wild type T. flavus DNA
polymerase gene is depicted in Fig. 5A. The 5B clone has the same leader amino
acids as do the DNAPTaq clones 4E and F which were cloned into pET-3c; it is
not known precisely where translation termination occurs, but the vector has a
strong transcription termination signal immediately downstream of the cloning site.

B. Growth And Induction Of Transformed Cells 

Bacterial cells were transformed with the constructs described above using
standard transformation techniques and used to inoculate 2 ml of a standard
growth medium (e.g., Luria-Bertani broth). The resulting cultures were incubated
as appropriate for the particular strain used, and induced if required for a particular
expression system. For all of the constructs depicted in Figs. 4 and 5, the cultures
were grown to an optical density (at 600 nm wavelength) of 0.5 OD.

To induce expression of the cloned genes, the cultures were brought to a
final concentration of 0.4 mM IPTG and the incubations were continued for 12 to
17 hours. 50 µl aliquots of each culture were removed both before and after
induction and were combined with 20 µl of a standard gel loading buffer for
sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE).
Subsequent staining with Coomassie Blue (Sambrook et al., supra) allows
visualization of the foreign proteins if they account for about 3-5% of the cellular
protein and do not co-migrate with any of the major E. coli protein bands.
Proteins that do co-migrate with a major host protein must be expressed as more
than 10% of the total protein to be seen at this stage of analysis.

C. Heat Lysis And Fractionation

Expressed thermostable proteins, i.e., the 5' nucleases, were isolated by
heating crude bacterial cell extracts to cause denaturation and precipitation of the
less stable E. coli proteins. The precipitated E. coli proteins were then, along with
other cell debris, removed by centrifugation. 1.7 ml of the culture were pelleted
by microcentrifugation at 12,000 to 14,000 rpm for 30 to 60 seconds. After
removal of the supernatant, the cells were resuspended in 400 µl of buffer A (50
mM Tris-HCl, pH 7.9, 50 mM dextrose, 1 mM EDTA), re-centrifuged, then
resuspended in 80 µl of buffer A with 4 mg/ml lysozyme. The cells were incubated
at room temperature for 15 minutes, then combined with 80 µl of buffer B (10 mM
Tris-HCl, pH 7.9, 50 mM KCl, 1 mM EDTA, 1 mM PMSF, 0.5% Tween-20,
0.5% Nonidet-P40).

This mixture was incubated at 75°C for 1 hour to denature and precipitate
the host proteins. This cell extract was centrifuged at 14,000 rpm for 15 minutes at
4°C, and the supernatant was transferred to a fresh tube. An aliquot of 0.5 to 1 µl
of this supernatant was used directly in each test reaction, and the protein content
of the extract was determined by subjecting 7 µl to electrophoretic analysis, as
above. The native recombinant Taq DNA polymerase [Engelke, Anal. Biochem.
191:396 (1990)], and the double point mutation protein shown in Fig. 4B are both
soluble and active at this point.

The foreign protein may not be detected after the heat treatments due to
sequestration of the foreign protein by the cells into inclusion bodies. These are
granules that form in the cytoplasm when bacteria are made to express high levels
of a foreign protein, and they can be purified from a crude lysate and analyzed
by SDS-PAGE to determine their protein content. Many methods have been described
in the literature, and one approach is described below.

D. Isolation And Solubilization Of Inclusion Bodies

A small culture was grown and induced as described above. A 1.7 ml
aliquot was pelleted by brief centrifugation, and the bacterial cells were
resuspended in 100 µl of Lysis buffer (50 mM Tris-HCl, pH 8.0, 1 mM EDTA,
100 mM NaCl). 2.5 µl of 20 mM PMSF were added for a final concentration of
0.5 mM, and lysozyme was added to a concentration of 1.0 mg/ml. The cells were
incubated at room temperature for 20 minutes, deoxycholic acid was added to
1 mg/ml (1 µl of 100 mg/ml solution), and the mixture was further incubated at
37°C for about 15 minutes or until viscous. DNAse I was added to 10 µg/ml and
the mixture was incubated at room temperature for about 30 minutes or until it was
no longer viscous.
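
The stated final concentrations of PMSF and deoxycholic acid can be checked with a simple dilution calculation. The following Python sketch is illustrative only: the function name is hypothetical, volumes are assumed to be additive, and the small volume contributed by the lysozyme addition is ignored.

```python
def final_concentration(stock_conc, stock_vol_ul, start_vol_ul):
    """Concentration after adding stock_vol_ul of stock to start_vol_ul.

    Assumes additive volumes; the result is in the same units as stock_conc.
    """
    return stock_conc * stock_vol_ul / (start_vol_ul + stock_vol_ul)


# PMSF: 2.5 µl of a 20 mM stock added to 100 µl of resuspended cells.
pmsf_mM = final_concentration(20.0, 2.5, 100.0)

# Deoxycholic acid: 1 µl of a 100 mg/ml stock added to the ~102.5 µl mixture
# (lysozyme volume ignored, which is an assumption of this sketch).
deoxycholate_mg_per_ml = final_concentration(100.0, 1.0, 102.5)

print(f"PMSF: {pmsf_mM:.2f} mM")
print(f"Deoxycholic acid: {deoxycholate_mg_per_ml:.2f} mg/ml")
```

Under these assumptions both values come out slightly below the nominal 0.5 mM and 1 mg/ml, which is consistent with the rounded figures given in the text.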

From this mixture the inclusion bodies were collected by centrifugation at
14,000 rpm for 15 minutes at 4°C, and the supernatant was discarded. The pellet
was resuspended in 100 µl of lysis buffer with 10 mM EDTA (pH 8.0) and 0.5%
Triton X-100. After 5 minutes at room temperature, the inclusion bodies were
pelleted as before, and the supernatant was saved for later analysis. The inclusion
bodies were resuspended in 50 µl of distilled water, and 5 µl was combined with
SDS gel loading buffer (which dissolves the inclusion bodies) and analyzed
electrophoretically, along with an aliquot of the supernatant.

If the cloned protein is found in the inclusion bodies, it may be released to
assay the cleavage and polymerase activities, and the method of solubilization must
be compatible with the particular activity. Different methods of solubilization may
be appropriate for different proteins, and a variety of methods are discussed in
Molecular Cloning (Sambrook et al., supra). The following is an adaptation we
have used for several of our isolates.

20 µl of the inclusion body-water suspension were pelleted by centrifugation
at 14,000 rpm for 4 minutes at room temperature, and the supernatant was
discarded. To further wash the inclusion bodies, the pellet was resuspended in 20 µl
of lysis buffer with 2M urea, and incubated at room temperature for one hour. The
washed inclusion bodies were then resuspended in 2 µl of lysis buffer with 8M
urea; the solution clarified visibly as the inclusion bodies dissolved. Undissolved
debris was removed by centrifugation at 14,000 rpm for 4 minutes at room
temperature, and the extract supernatant was transferred to a fresh tube.

To reduce the urea concentration, the extract was diluted into KH2PO4. A
fresh tube was prepared containing 180 µl of 50 mM KH2PO4, pH 9.5, 1 mM
EDTA and 50 mM NaCl. A 2 µl aliquot of the extract was added and vortexed
briefly to mix. This step was repeated until all of the extract had been added for a
total of 10 additions. The mixture was allowed to sit at room temperature for 15
minutes, during which time some precipitate often forms. Precipitates were
removed by centrifugation at 14,000 rpm, for 15 minutes at room temperature, and
the supernatant was transferred to a fresh tube. To the 200 µl of protein in the
KH2PO4 solution, 140-200 µl of saturated (NH4)2SO4 were added, so that the
resulting mixture was about 41% to 50% saturated (NH4)2SO4. The mixture was
chilled on ice for 30 minutes to allow the protein to precipitate, and the protein was
then collected by centrifugation at 14,000 rpm, for 4 minutes at room temperature.
The supernatant was discarded, and the pellet was dissolved in 20 µl Buffer C (20
mM HEPES, pH 7.9, 1 mM EDTA, 0.5% PMSF, 25 mM KCl and 0.5% each of
Tween-20 and Nonidet P 40). The protein solution was centrifuged again for 4
minutes to pellet insoluble materials, and the supernatant was removed to a fresh
tube. The protein contents of extracts prepared in this manner were visualized by
resolving 1-4 µl by SDS-PAGE; 0.5 to 1 µl of extract was tested in the cleavage
and polymerization assays as described.
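
The 41% to 50% saturation range quoted for the ammonium sulfate step follows from the volumes given. A minimal Python sketch of that arithmetic is shown here; the function name is hypothetical, and the calculation assumes additive volumes and a protein solution that initially contains no ammonium sulfate.

```python
def percent_saturation(saturated_vol_ul, sample_vol_ul):
    """Percent saturation after mixing a saturated (NH4)2SO4 solution
    with a sample containing none (volumes assumed additive)."""
    return 100.0 * saturated_vol_ul / (saturated_vol_ul + sample_vol_ul)


sample_ul = 200.0  # protein in the KH2PO4 solution
low = percent_saturation(140.0, sample_ul)
high = percent_saturation(200.0, sample_ul)

print(f"{low:.1f}% to {high:.1f}% saturated")
```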



E. Protein Analysis For Presence Of Nuclease And 
Synthetic Activity 

The 5' nucleases described above and shown in Figs. 4 and 5 were analyzed
by the following methods.



1. Structure Specific Nuclease Assay 

A candidate modified polymerase is tested for 5' nuclease activity by
examining its ability to catalyze structure-specific cleavages. By the term "cleavage
structure" as used herein, is meant a nucleic acid structure which is a substrate for
cleavage by the 5' nuclease activity of a DNAP.

The polymerase is exposed to test complexes that have the structures shown
in Fig. 16. Testing for 5' nuclease activity involves three reactions: 1) a
primer-directed cleavage (Fig. 16B) is performed because it is relatively insensitive to
variations in the salt concentration of the reaction and can, therefore, be performed
in whatever solute conditions the modified enzyme requires for activity; these are
generally the same conditions preferred by unmodified polymerases; 2) a similar
primer-directed cleavage is performed in a buffer which permits primer-independent
cleavage, i.e., a low salt buffer, to demonstrate that the enzyme is viable under
these conditions; and 3) a primer-independent cleavage (Fig. 16A) is performed in
the same low salt buffer.

The bifurcated duplex is formed between a substrate strand and a template
strand as shown in Fig. 16. By the term "substrate strand" as used herein, is meant
that strand of nucleic acid in which the cleavage mediated by the 5' nuclease
activity occurs. The substrate strand is always depicted as the top strand in the
bifurcated complex which serves as a substrate for 5' nuclease cleavage (Fig. 16).
By the term "template strand" as used herein, is meant the strand of nucleic acid
which is at least partially complementary to the substrate strand and which anneals
to the substrate strand to form the cleavage structure. The template strand is
always depicted as the bottom strand of the bifurcated cleavage structure (Fig. 16).
If a primer (a short oligonucleotide of 19 to 30 nucleotides in length) is added to
the complex, as when primer-dependent cleavage is to be tested, it is designed to
anneal to the 3' arm of the template strand (Fig. 16B). Such a primer would be
extended along the template strand if the polymerase used in the reaction has
synthetic activity.

The cleavage structure may be made as a single hairpin molecule, with the
3' end of the target and the 5' end of the pilot joined as a loop as shown in Fig.
16E. A primer oligonucleotide complementary to the 3' arm is also required for
these tests so that the enzyme's sensitivity to the presence of a primer may be
tested.

Nucleic acids to be used to form test cleavage structures can be chemically
synthesized, or can be generated by standard recombinant DNA techniques. By the
latter method, the hairpin portion of the molecule can be created by inserting into a
cloning vector duplicate copies of a short DNA segment, adjacent to each other but
in opposing orientation. The double-stranded fragment encompassing this inverted
repeat, and including enough flanking sequence to give short (about 20 nucleotides)
unpaired 5' and 3' arms, can then be released from the vector by restriction
enzyme digestion, or by PCR performed with an enzyme lacking a 5' exonuclease
(e.g., the Stoffel fragment of Amplitaq™ DNA polymerase, Vent™ DNA
polymerase).

The test DNA can be labeled on either end, or internally, with either a
radioisotope, or with a non-isotopic tag. Whether the hairpin DNA is a synthetic
single strand or a cloned double strand, the DNA is heated prior to use to melt all
duplexes. When cooled on ice, the structure depicted in Fig. 16E is formed, and is
stable for sufficient time to perform these assays.

To test for primer-directed cleavage (Reaction 1), a detectable quantity of
the test molecule (typically 1-100 fmol of 32P-labeled hairpin molecule) and a 10 to
100-fold molar excess of primer are placed in a buffer known to be compatible
with the test enzyme. For Reaction 2, where primer-directed cleavage is performed
under conditions which allow primer-independent cleavage, the same quantities of
molecules are placed in a solution that is the same as the buffer used in Reaction 1
regarding pH, enzyme stabilizers (e.g., bovine serum albumin, nonionic detergents,
gelatin) and reducing agents (e.g., dithiothreitol, 2-mercaptoethanol) but that
replaces any monovalent cation salt with 20 mM KCl; 20 mM KCl is the
demonstrated optimum for primer-independent cleavage. Buffers for enzymes, such
as DNAPEc1, that usually operate in the absence of salt are not supplemented to
achieve this concentration. To test for primer-independent cleavage (Reaction 3)
the same quantity of the test molecule, but no primer, is combined under the same
buffer conditions used for Reaction 2.
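
The three test reactions differ only in whether primer is present and in the monovalent salt used. The Python sketch below summarizes that design as data; the class, field and function names are hypothetical, and the salt for Reaction 1 is recorded as None because it depends on the buffer preferred by the enzyme under test.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CleavageReaction:
    number: int
    primer_present: bool
    kcl_mM: Optional[float]  # None = salt of the enzyme's own preferred buffer


REACTIONS = (
    CleavageReaction(1, True, None),   # primer-directed, enzyme's own buffer
    CleavageReaction(2, True, 20.0),   # primer-directed, low salt
    CleavageReaction(3, False, 20.0),  # primer-independent, low salt
)


def primer_pmol_range(test_molecule_fmol):
    """Primer amounts (pmol) giving a 10- to 100-fold molar excess."""
    return (10 * test_molecule_fmol / 1000.0, 100 * test_molecule_fmol / 1000.0)


low_pmol, high_pmol = primer_pmol_range(10.0)  # for 10 fmol of test molecule
print(low_pmol, high_pmol)
```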

All three test reactions are then exposed to enough of the enzyme that the
molar ratio of enzyme to test complex is approximately 1:1. The reactions are
incubated at a range of temperatures up to, but not exceeding, the temperature
allowed by either the enzyme stability or the complex stability, whichever is lower,
up to 80°C for enzymes from thermophiles, for a time sufficient to allow cleavage
(10 to 60 minutes). The products of Reactions 1, 2 and 3 are resolved by
denaturing polyacrylamide gel electrophoresis, and visualized by autoradiography or
by a comparable method appropriate to the labeling system used. Additional
labeling systems include chemiluminescence detection, silver or other stains,
blotting and probing and the like. The presence of cleavage products is indicated
by the presence of molecules which migrate at a lower molecular weight than does
the uncleaved test structure. These cleavage products indicate that the candidate
polymerase has structure-specific 5' nuclease activity.

To determine whether a modified DNA polymerase has substantially the
same 5' nuclease activity as that of the native DNA polymerase, the results of the
above-described tests are compared with the results obtained from these tests
performed with the native DNA polymerase. By "substantially the same 5'
nuclease activity" we mean that the modified polymerase and the native polymerase
will both cleave test molecules in the same manner. It is not necessary that the
modified polymerase cleave at the same rate as the native DNA polymerase.

Some enzymes or enzyme preparations may have other associated or
contaminating activities that may be functional under the cleavage conditions
described above and that may interfere with 5' nuclease detection. Reaction
conditions can be modified in consideration of these other activities, to avoid
destruction of the substrate, or other masking of the 5' nuclease cleavage and its
products. For example, the DNA polymerase I of E. coli (Pol I), in addition to its
polymerase and 5' nuclease activities, has a 3' exonuclease that can degrade DNA
in a 3' to 5' direction. Consequently, when the molecule in Fig. 16E is exposed to
this polymerase under the conditions described above, the 3' exonuclease quickly
removes the unpaired 3' arm, destroying the bifurcated structure required of a
substrate for the 5' exonuclease cleavage and no cleavage is detected. The true
ability of Pol I to cleave the structure can be revealed if the 3' exonuclease is
inhibited by a change of conditions (e.g., pH), mutation, or by addition of a
competitor for the activity. Addition of 500 pmoles of a single-stranded competitor
oligonucleotide, unrelated to the Fig. 16E structure, to the cleavage reaction with
Pol I effectively inhibits the digestion of the 3' arm of the Fig. 16E structure
without interfering with the 5' exonuclease release of the 5' arm. The
concentration of the competitor is not critical, but should be high enough to occupy
the 3' exonuclease for the duration of the reaction.

Similar destruction of the test molecule may be caused by contaminants in
the candidate polymerase preparation. Several sets of the structure specific
nuclease reactions may be performed to determine the purity of the candidate
nuclease and to find the window between under and over exposure of the test
molecule to the polymerase preparation being investigated.

The above described modified polymerases were tested for 5' nuclease
activity as follows: Reaction 1 was performed in a buffer of 10 mM Tris-Cl, pH
8.5 at 20°C, 1.5 mM MgCl2 and 50 mM KCl and in Reaction 2 the KCl
concentration was reduced to 20 mM. In Reactions 1 and 2, 10 fmoles of the test
substrate molecule shown in Fig. 16E were combined with 1 pmole of the indicated
primer and 0.5 to 1.0 µl of extract containing the modified polymerase (prepared as
described above). This mixture was then incubated for 10 minutes at 55°C. For
all of the mutant polymerases tested these conditions were sufficient to give
complete cleavage. When the molecule shown in Fig. 16E was labeled at the 5'
end, the released 5' fragment, 25 nucleotides long, was conveniently resolved on a
20% polyacrylamide gel (19:1 cross-linked) with 7 M urea in a buffer containing
45 mM Tris-borate pH 8.3, 1.4 mM EDTA. Clones 4C-F and 5B exhibited
structure-specific cleavage comparable to that of the unmodified DNA polymerase.
Additionally, clones 4E, 4F and 4G have the added ability to cleave DNA in the
absence of a 3' arm as discussed above. Representative cleavage reactions are
shown in Figure 17.

For the reactions shown in Fig. 17, the mutant polymerase clones 4E (Taq
mutant) and 5B (Tfl mutant) were examined for their ability to cleave the hairpin
substrate molecule shown in Fig. 16E. The substrate molecule was labeled at the
5' terminus with 32P. 10 fmoles of heat-denatured, end-labeled substrate DNA and
0.5 units of DNAPTaq (lane 1) or 0.5 µl of 4E or 5B extract (Fig. 17, lanes 2-7,
extract was prepared as described above) were mixed together in a buffer
containing 10 mM Tris-Cl, pH 8.5, 50 mM KCl and 1.5 mM MgCl2. The final
reaction volume was 10 µl. Reactions shown in lanes 4 and 7 contain in addition
50 µM of each dNTP. Reactions shown in lanes 3, 4, 6 and 7 contain 0.2 µM of
the primer oligonucleotide (complementary to the 3' arm of the substrate and
shown in Fig. 16E). Reactions were incubated at 55°C for 4 minutes. Reactions
were stopped by the addition of 8 µl of 95% formamide containing 20 mM EDTA
and 0.05% marker dyes per 10 µl reaction volume. Samples were then applied to
12% denaturing acrylamide gels. Following electrophoresis, the gels were
autoradiographed. Fig. 17 shows that clones 4E and 5B exhibit cleavage activity
similar to that of the native DNAPTaq. Note that some cleavage occurs in these
reactions in the absence of the primer. When long hairpin structures, such as the
one used here (Fig. 16E), are used in cleavage reactions performed in buffers
containing 50 mM KCl a low level of primer-independent cleavage is seen. Higher
concentrations of KCl suppress, but do not eliminate, this primer-independent
cleavage under these conditions.
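
The concentrations quoted for the Fig. 17 reactions can be converted to absolute amounts to see how large the primer excess over the substrate is. The Python sketch below is illustrative only; the function and variable names are hypothetical, and it relies on the identity that a concentration in µM multiplied by a volume in µl gives an amount in pmol.

```python
def pmol_from_micromolar(conc_uM, volume_ul):
    """Amount in pmol, since 1 µM x 1 µl = 1 pmol."""
    return conc_uM * volume_ul


reaction_volume_ul = 10.0
substrate_fmol = 10.0

primer_pmol = pmol_from_micromolar(0.2, reaction_volume_ul)
primer_fold_excess = primer_pmol * 1000.0 / substrate_fmol
dntp_pmol_each = pmol_from_micromolar(50.0, reaction_volume_ul)

print(f"primer: {primer_pmol:.1f} pmol ({primer_fold_excess:.0f}-fold over substrate)")
print(f"each dNTP: {dntp_pmol_each:.0f} pmol")
```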



2. Assay For Synthetic Activity 

The ability of the modified enzyme or proteolytic fragments to synthesize DNA
is assayed by adding the modified enzyme to an assay system in which a primer is
annealed to a template and DNA synthesis is catalyzed by the added enzyme. Many
standard laboratory techniques employ such an assay. For example, nick translation
and enzymatic sequencing involve extension of a primer along a DNA template by a
polymerase molecule.

In a preferred assay for determining the synthetic activity of a modified
enzyme an oligonucleotide primer is annealed to a single-stranded DNA template,
e.g., bacteriophage M13 DNA, and the primer/template duplex is incubated in the
presence of the modified polymerase in question, deoxynucleoside triphosphates
(dNTPs) and the buffer and salts known to be appropriate for the unmodified or
native enzyme. Detection of either primer extension (by denaturing gel
electrophoresis) or dNTP incorporation (by acid precipitation or chromatography) is
indicative of an active polymerase. A label, either isotopic or non-isotopic, is
preferably included on either the primer or as a dNTP to facilitate detection of
polymerization products. Synthetic activity is quantified as the amount of free
nucleotide incorporated into the growing DNA chain and is expressed as amount
incorporated per unit of time under specific reaction conditions.

Representative results of an assay for synthetic activity are shown in Fig. 18.
The synthetic activity of the mutant DNAPTaq clones 4B-F was tested as follows:
A master mixture of the following buffer was made: 1.2X PCR buffer (1X PCR
buffer contains 50 mM KCl, 1.5 mM MgCl2, 10 mM Tris-Cl, pH 8.5 and 0.05%
each Tween 20 and Nonidet P40), 50 µM each of dGTP, dATP and dTTP, 5 µM
dCTP and 0.125 µM α-32P-dCTP at 600 Ci/mmol. Before adjusting this mixture to
its final volume, it was divided into two equal aliquots. One received distilled
water up to a volume of 50 µl to give the concentrations above. The other
received 5 µg of single-stranded M13mp18 DNA (approximately 2.5 pmol or 0.05
µM final concentration) and 250 pmol of M13 sequencing primer (5 µM final
concentration) and distilled water to a final volume of 50 µl. Each cocktail was
warmed to 75°C for 5 minutes and then cooled to room temperature. This allowed
the primers to anneal to the DNA in the DNA-containing mixtures.
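
The template and primer concentrations quoted for the DNA-containing cocktail follow directly from the amounts and the 50 µl final volume. A short Python sketch of the conversion is given here; the function and variable names are hypothetical, and the only relationship used is that 1 pmol per µl equals 1 µM.

```python
def micromolar(amount_pmol, volume_ul):
    """Concentration in µM, since 1 pmol/µl = 1 µM."""
    return amount_pmol / volume_ul


cocktail_volume_ul = 50.0
template_uM = micromolar(2.5, cocktail_volume_ul)   # M13mp18 template
primer_uM = micromolar(250.0, cocktail_volume_ul)   # M13 sequencing primer
primer_to_template = 250.0 / 2.5                    # molar ratio

print(f"template: {template_uM} µM, primer: {primer_uM} µM, "
      f"ratio: {primer_to_template:.0f}:1")
```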

For each assay, 4 µl of the cocktail with the DNA was combined with 1 µl
of the mutant polymerase, prepared as described, or 1 unit of DNAPTaq (Perkin
Elmer) in 1 µl of dH2O. A "no DNA" control was done in the presence of the
DNAPTaq (Fig. 18, lane 1), and a "no enzyme" control was done using water in
place of the enzyme (lane 2). Each reaction was mixed, then incubated at room
temperature (approx. 22°C) for 5 minutes, then at 55°C for 2 minutes, then at 72°C
for 2 minutes. This step incubation was done to detect polymerization in any
mutants that might have optimal temperatures lower than 72°C. After the final
incubation, the tubes were spun briefly to collect any condensation and were placed
on ice. One µl of each reaction was spotted at an origin 1.5 cm from the bottom
edge of a polyethyleneimine (PEI) cellulose thin layer chromatography plate and
allowed to dry. The chromatography plate was run in 0.75 M NaH2PO4, pH 3.5,
until the buffer front had run approximately 9 cm from the origin. The plate was
dried, wrapped in plastic wrap, marked with luminescent ink, and exposed to X-ray
film. Incorporation was detected as counts that stuck where originally spotted,
while the unincorporated nucleotides were carried by the salt solution from the
origin.

Comparison of the locations of the counts with the two control lanes
confirmed the lack of polymerization activity in the mutant preparations. Among
the modified DNAPTaq clones, only clone 4B retains any residual synthetic activity
as shown in Fig. 18.

EXAMPLE 3 

5' Nucleases Derived From Thermostable DNA Polymerases
Can Cleave Short Hairpin Structures With Specificity



The ability of the 5' nucleases to cleave hairpin structures to generate a
cleaved hairpin structure suitable as a detection molecule was examined. The
structure and sequence of the hairpin test molecule is shown in Fig. 19A (SEQ ID
NO:15). The oligonucleotide (labeled "primer" in Fig. 19A, SEQ ID NO:22) is
shown annealed to its complementary sequence on the 3' arm of the hairpin test
molecule. The hairpin test molecule was single-end labeled with 32P using a
labeled T7 promoter primer in a polymerase chain reaction. The label is present on
the 5' arm of the hairpin test molecule and is represented by the star in Fig. 19A.

The cleavage reaction was performed by adding 10 fmoles of
heat-denatured, end-labeled hairpin test molecule, 0.2 µM of the primer oligonucleotide
(complementary to the 3' arm of the hairpin), 50 µM of each dNTP and 0.5 units
of DNAPTaq (Perkin Elmer) or 0.5 µl of extract containing a 5' nuclease (prepared
as described above) in a total volume of 10 µl in a buffer containing 10 mM
Tris-Cl, pH 8.5, 50 mM KCl and 1.5 mM MgCl2. Reactions shown in lanes 3, 5 and 7
were run in the absence of dNTPs.

Reactions were incubated at 55°C for 4 minutes. Reactions were stopped
at 55°C by the addition of 8 µl of 95% formamide with 20 mM EDTA and 0.05%
marker dyes per 10 µl reaction volume. Samples were not heated before loading
onto denaturing polyacrylamide gels (10% polyacrylamide, 19:1 crosslinking, 7 M
urea, 89 mM Tris-borate, pH 8.3, 2.8 mM EDTA). The samples were not heated
to allow for the resolution of single-stranded and re-duplexed uncleaved hairpin
molecules.

Fig. 19B shows that altered polymerases lacking any detectable synthetic
activity cleave a hairpin structure when an oligonucleotide is annealed to the
single-stranded 3' arm of the hairpin to yield a single species of cleaved product (Fig.
19B, lanes 3 and 4). 5' nucleases, such as clone 4D, shown in lanes 3 and 4,
produce a single cleaved product even in the presence of dNTPs. 5' nucleases
which retain a residual amount of synthetic activity (less than 1% of wild type
activity) produce multiple cleavage products as the polymerase can extend the
oligonucleotide annealed to the 3' arm of the hairpin thereby moving the site of
cleavage (clone 4B, lanes 5 and 6). Native DNAPTaq produces even more species
of cleavage products than do mutant polymerases retaining residual synthetic
activity and additionally converts the hairpin structure to a double-stranded form in
the presence of dNTPs due to the high level of synthetic activity in the native
polymerase (Fig. 19B, lane 8).

EXAMPLE 4 
Test Of The Trigger/Detection Assay 

To test the ability of an oligonucleotide of the type released in the trigger
reaction of the trigger/detection assay to be detected in the detection reaction of the
assay, the two hairpin structures shown in Fig. 20A were synthesized using
standard techniques. The two hairpins are termed the A-hairpin (SEQ ID NO:23)
and the T-hairpin (SEQ ID NO:24). The predicted sites of cleavage in the presence
of the appropriate annealed primers are indicated by the arrows. The A- and
T-hairpins were designed to prevent intra-strand mis-folding by omitting most of the
T residues in the A-hairpin and omitting most of the A residues in the T-hairpin.
To avoid mis-priming and slippage, the hairpins were designed with local variations
in the sequence motifs (e.g., spacing T residues one or two nucleotides apart or in
pairs). The A- and T-hairpins can be annealed together to form a duplex which has
appropriate ends for directional cloning in pUC-type vectors; restriction sites are
located in the loop regions of the duplex and can be used to elongate the stem
regions if desired.

The sequence of the test trigger oligonucleotide is shown in Fig. 20B; this
oligonucleotide is termed the alpha primer (SEQ ID NO:25). The alpha primer is
complementary to the 3' arm of the T-hairpin as shown in Fig. 20A. When the
alpha primer is annealed to the T-hairpin, a cleavage structure is formed that is
recognized by thermostable DNA polymerases. Cleavage of the T-hairpin liberates
the 5' single-stranded arm of the T-hairpin, generating the tau primer (SEQ ID
NO:26) and a cleaved T-hairpin (Fig. 20B; SEQ ID NO:27). The tau primer is
complementary to the 3' arm of the A-hairpin as shown in Fig. 20A. Annealing of
the tau primer to the A-hairpin generates another cleavage structure; cleavage of
this second cleavage structure liberates the 5' single-stranded arm of the A-hairpin,
generating another molecule of the alpha primer which then is annealed to another
molecule of the T-hairpin. Thermocycling releases the primers so they can
function in additional cleavage reactions. Multiple cycles of annealing and
cleavage are carried out. The products of the cleavage reactions are primers and
the shortened hairpin structures shown in Fig. 20C. The shortened or cleaved
hairpin structures may be resolved from the uncleaved hairpins by electrophoresis
on denaturing acrylamide gels.

The annealing and cleavage reactions are carried out as follows: In a 50 µl
reaction volume containing 10 mM Tris-Cl, pH 8.5, 1.0 mM MgCl2, 75 mM KCl, 1
pmole of A-hairpin, 1 pmole T-hairpin, the alpha primer is added at equimolar
amount relative to the hairpin structures (1 pmole) or at dilutions ranging from 10-
to 10^6-fold and 0.5 µl of extract containing a 5' nuclease (prepared as described
above) are added. The predicted melting temperature for the alpha or trigger
primer is 60°C in the above buffer. Annealing is performed just below this
predicted melting temperature at 55°C. Using a Perkin Elmer DNA Thermal
Cycler, the reactions are annealed at 55°C for 30 seconds. The temperature is then
increased slowly over a five minute period to 72°C to allow for cleavage. After
cleavage, the reactions are rapidly brought to 55°C (1°C per second) to allow
another cycle of annealing to occur. A range of cycles are performed (20, 40 and
60 cycles) and the reaction products are analyzed at each of these numbers of
cycles. The number of cycles which indicates that the accumulation of cleaved
hairpin products has not reached a plateau is then used for subsequent
determinations when it is desirable to obtain a quantitative result.
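
To give a sense of the trigger amounts covered by the dilution range described above, the following Python sketch lists the alpha primer input at each ten-fold dilution from 1 pmole, together with the corresponding number of molecules. It is a hedged illustration: the function name is hypothetical, and only the starting amount and the 10- to 10^6-fold range are taken from the text.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def trigger_dilution_series(start_pmol=1.0, max_log10_dilution=6):
    """Return (fold_dilution, pmol, molecules) for each ten-fold step."""
    rows = []
    for k in range(max_log10_dilution + 1):
        fold = 10 ** k
        pmol = start_pmol / fold
        molecules = pmol * 1e-12 * AVOGADRO
        rows.append((fold, pmol, molecules))
    return rows


series = trigger_dilution_series()
for fold, pmol, molecules in series:
    print(f"{fold:>9,d}-fold: {pmol:.0e} pmol, about {molecules:.2e} molecules")
```

At the highest dilution the trigger input is on the order of 10^5 to 10^6 molecules against 1 pmole of each hairpin.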

Following the desired number of cycles, the reactions are stopped at 55°C
by the addition of 8 µl of 95% formamide with 20 mM EDTA and 0.05% marker
dyes per 10 µl reaction volume. Samples are not heated before loading onto
denaturing polyacrylamide gels (10% polyacrylamide, 19:1 crosslinking, 7 M urea,
89 mM Tris-borate, pH 8.3, 2.8 mM EDTA). The samples are not heated, to allow
for the resolution of single-stranded and re-duplexed uncleaved hairpin molecules.

The hairpin molecules may be attached to separate solid support molecules,
such as agarose, styrene or magnetic beads, via the 3' end of each hairpin. A
spacer molecule may be placed between the 3' end of the hairpin and the bead if so
desired. The advantage of attaching the hairpins to a solid support is that this
prevents the hybridization of the A- and T-hairpins to one another during the cycles
of melting and annealing. The A- and T-hairpins are complementary to one
another (as shown in Fig. 20D) and if allowed to anneal to one another over their
entire lengths this would reduce the amount of hairpins available for hybridization
to the alpha and tau primers during the detection reaction.

The 5' nucleases of the present invention are used in this assay because they
lack significant synthetic activity. The lack of synthetic activity results in the
production of a single cleaved hairpin product (as shown in Fig. 19B, lane 4).
Multiple cleavage products may be generated by 1) the presence of interfering
synthetic activity (see Fig. 19B, lanes 6 and 8) or 2) the presence of
primer-independent cleavage in the reaction. The presence of primer-independent cleavage
is detected in the trigger/detection assay by the presence of different sized products
at the fork of the cleavage structure. Primer-independent cleavage can be
dampened or repressed, when present, by the use of uncleavable nucleotides in the
fork region of the hairpin molecule. For example, thiolated nucleotides can be
used to replace several nucleotides at the fork region to prevent primer-independent
cleavage.

EXAMPLE 5

Cleavage Of Linear Nucleic Acid Substrates 

From the above, it should be clear that native (i.e., "wild type")
thermostable DNA polymerases are capable of cleaving hairpin structures in a
specific manner and that this discovery can be applied with success to a detection
assay. In this example, the mutant DNAPs of the present invention are tested
against three different cleavage structures shown in Figure 22A. Structure 1 in
Figure 22A is simply a single-stranded 206-mer (the preparation and sequence
information for which was discussed above). Structures 2 and 3 are duplexes;
structure 2 is the same hairpin structure as shown in Figure 12A (bottom), while
structure 3 has the hairpin portion of structure 2 removed.

The cleavage reactions comprised 0.01 pmoles of the resulting substrate
DNA, and 1 pmole of pilot oligonucleotide in a total volume of 10 µl of 10 mM
Tris-Cl, pH 8.3, 100 mM KCl, 1 mM MgCl2. Reactions were incubated for 30
minutes at 55°C, and stopped by the addition of 8 µl of 95% formamide with 20
mM EDTA and 0.05% marker dyes. Samples were heated to 75°C for 2 minutes
immediately before electrophoresis through a 10% polyacrylamide gel (19:1 cross
link), with 7 M urea, in a buffer of 45 mM Tris-Borate, pH 8.3, 1.4 mM EDTA.
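The quantities in the cleavage reaction above can be converted to molar concentrations with a few lines of arithmetic. The following Python sketch is illustrative only; the function and variable names are hypothetical, and it assumes that the 20 mM EDTA figure refers to the concentration in the stop solution before it is added to the reaction.

```python
def conc_nM(pmoles: float, volume_ul: float) -> float:
    """Convert an amount in pmol and a volume in µl to nM (pmol/µl equals µM)."""
    return pmoles / volume_ul * 1000.0

reaction_ul = 10.0                           # total reaction volume from the text
substrate_nM = conc_nM(0.01, reaction_ul)    # 0.01 pmol substrate DNA
pilot_nM = conc_nM(1.0, reaction_ul)         # 1 pmol pilot oligonucleotide
pilot_excess = pilot_nM / substrate_nM       # molar excess of pilot over substrate

# Assumption: 8 µl of stop solution containing 20 mM EDTA is added to the
# 10 µl reaction, which contains 1 mM MgCl2.
stop_ul = 8.0
final_ul = reaction_ul + stop_ul
edta_final_mM = 20.0 * stop_ul / final_ul
mg_final_mM = 1.0 * reaction_ul / final_ul

print(f"substrate {substrate_nM:.1f} nM, pilot {pilot_nM:.0f} nM "
      f"({pilot_excess:.0f}-fold excess)")
print(f"after stop: EDTA {edta_final_mM:.2f} mM vs Mg2+ {mg_final_mM:.2f} mM")
```

Under these assumptions the pilot oligonucleotide is in 100-fold molar excess over the substrate, and the EDTA added with the stop solution exceeds the magnesium in the reaction.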

The results were visualized by autoradiography and are shown in Figure
22B with the enzymes indicated as follows: I is native Taq DNAP; II is native Tfl
DNAP; III is Cleavase™ BX shown in Figure 4E; IV is Cleavase™ BB shown in
Figure 4F; V is the mutant shown in Figure 5B; and VI is Cleavase™ BN shown
in Figure 4G.

Structure 2 was used to "normalize" the comparison. For example, it was
found that it took 50 ng of Taq DNAP and 300 ng of Cleavase™ BN to give
similar amounts of cleavage of Structure 2 in thirty (30) minutes. Under these
conditions native Taq DNAP is unable to cleave Structure 3 to any significant
degree. Native Tfl DNAP cleaves Structure 3 in a manner that creates multiple
products.

By contrast, all of the mutants tested cleave the linear duplex of Structure 3.
This finding indicates that this characteristic of the mutant DNA polymerases is
consistent among thermostable polymerases across thermophilic species.

The finding described herein that the mutant DNA polymerases of the
present invention are capable of cleaving linear duplex structures allows for
application to a more straightforward assay design (Figure 1A). Figure 23 provides
a more detailed schematic corresponding to the assay design of Figure 1A.

The two 43-mers depicted in Figure 23 were synthesized by standard
methods. Each included a fluorescein on the 5' end for detection purposes and a
biotin on the 3' end to allow attachment to streptavidin coated paramagnetic
particles (the biotin-avidin attachment is indicated by "vsrs). 

Before the trityl groups were removed, the oligos were purified by HPLC to
remove truncated by-products of the synthesis reaction. Aliquots of each 43-mer
were bound to M-280 Dynabeads (Dynal) at a density of 100 pmoles per mg of
beads. Two (2) mgs of beads (200 µl) were washed twice in 1X wash/bind buffer
(1 M NaCl, 5 mM Tris-Cl, pH 7.5, 0.5 mM EDTA) with 0.1% BSA, 200 µl per
wash. The beads were magnetically sedimented between washes to allow
supernatant removal. After the second wash, the beads were resuspended in 200 µl
of 2X wash/bind buffer (2 M NaCl, 10 mM Tris-Cl, pH 7.5 with 1 mM EDTA),
and divided into two 100 µl aliquots. Each aliquot received 1 µl of a 100 µM
solution of one of the two oligonucleotides. After mixing, the beads were
incubated at room temperature for 60 minutes with occasional gentle mixing. The
beads were then sedimented and analysis of the supernatants showed only trace
amounts of unbound oligonucleotide, indicating successful binding. Each aliquot of
beads was washed three times, 100 µl per wash, with 1X wash/bind buffer, then
twice in a buffer of 10 mM Tris-Cl, pH 8.3 and 75 mM KCl. The beads were
resuspended in a final volume of 100 µl of the Tris/KCl, for a concentration of 1
pmole of oligo bound to 10 µg of beads per µl of suspension. The beads were
stored at 4°C between uses.
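The bead-loading figures quoted above (100 pmoles per mg, and 1 pmole of oligo on 10 µg of beads per µl) follow from the stated volumes and concentrations. The short Python sketch below reproduces that arithmetic; the variable names are hypothetical, and it assumes that essentially all of the added oligonucleotide bound to the beads, as the trace amount found in the supernatant suggests.

```python
# Inputs taken from the text
beads_total_mg = 2.0        # 2 mg of beads washed in total
n_aliquots = 2              # divided into two 100 µl aliquots
oligo_volume_ul = 1.0       # 1 µl of oligonucleotide solution per aliquot
oligo_conc_uM = 100.0       # 100 µM stock
final_volume_ul = 100.0     # final resuspension volume per aliquot

beads_per_aliquot_mg = beads_total_mg / n_aliquots

# µl x µM = pmol, so 1 µl of a 100 µM solution carries 100 pmol
oligo_pmol = oligo_volume_ul * oligo_conc_uM

# Assumption: binding is quantitative (only trace oligo remained unbound)
density_pmol_per_mg = oligo_pmol / beads_per_aliquot_mg
oligo_pmol_per_ul = oligo_pmol / final_volume_ul
beads_ug_per_ul = beads_per_aliquot_mg * 1000.0 / final_volume_ul

print(f"{density_pmol_per_mg:.0f} pmol oligo per mg beads")
print(f"{oligo_pmol_per_ul:.0f} pmol oligo on {beads_ug_per_ul:.0f} µg beads per µl")
```

A 1 µl portion of the bead suspension therefore delivers about 1 pmole of bound oligonucleotide to a cleavage reaction.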

The types of beads correspond to Figure 1A. That is to say, type 2 beads
contain the oligo (SEQ ID NO:33) comprising the complementary sequence (SEQ
ID NO:34) for the alpha signal oligo (SEQ ID NO:35) as well as the beta signal
oligo (SEQ ID NO:36) which when liberated is a 24-mer. This oligo has no "As"
and is "T" rich. Type 3 beads contain the oligo (SEQ ID NO:37) comprising the
complementary sequence (SEQ ID NO:38) for the beta signal oligo (SEQ ID
NO:39) as well as the alpha signal oligo (SEQ ID NO:35) which when liberated is
a 20-mer. This oligo has no "Ts" and is "A" rich.

Cleavage reactions comprised 1 µl of the indicated beads, 10 pmoles of
unlabelled alpha signal oligo as "pilot" (if indicated) and 500 ng of Cleavase™ BN
in 20 µl of 75 mM KCl, 10 mM Tris-Cl, pH 8.3, 1.5 mM MgCl2 and 10 µM
CTAB. All components except the enzyme were assembled, overlaid with light
mineral oil and warmed to 53°C. The reactions were initiated by the addition of
prewarmed enzyme and incubated at that temperature for 30 minutes. Reactions
were stopped at temperature by the addition of 16 µl of 95% formamide with 20
mM EDTA and 0.05% each of bromophenol blue and xylene cyanol. This addition
stops the enzyme activity and, upon heating, disrupts the biotin-avidin link,
releasing the majority (greater than 95%) of the oligos from the beads. Samples
were heated to 75°C for 2 minutes immediately before electrophoresis through a
10% polyacrylamide gel (19:1 cross link), with 7 M urea, in a buffer of 45 mM
Tris-Borate, pH 8.3, 1.4 mM EDTA. Results were visualized by contact transfer of
the resolved DNA to positively charged nylon membrane and probing of the
blocked membrane with an anti-fluorescein antibody conjugated to alkaline
phosphatase. After washing, the signal was developed by incubating the membrane
in Western Blue (Promega) which deposits a purple precipitate where the antibody
is bound.

Figure 24 shows the propagation of cleavage of the linear duplex nucleic
acid structures of Figure 23 by the DNAP mutants of the present invention. The
two center lanes contain both types of beads. As noted above, the beta signal oligo
(SEQ ID NO:36) when liberated is a 24-mer and the alpha signal oligo (SEQ ID
NO:35) when liberated is a 20-mer. The formation of the two lower bands
corresponding to the 24-mer and 20-mer is clearly dependent on "pilot".

EXAMPLE 6 

5' Exonucleolytic Cleavage ("Nibbling") By Thermostable DNAPs 

It has been found that thermostable DNAPs, including those of the present
invention, have a true 5' exonuclease capable of nibbling the 5' end of a linear
duplex nucleic acid structure. In this example, the 206 base pair DNA duplex
substrate is again employed (see above). In this case, it was produced by the use
of one 32P-labeled primer and one unlabeled primer in a polymerase chain reaction.
The cleavage reactions comprised 0.01 pmoles of heat-denatured, end-labeled
substrate DNA (with the unlabeled strand also present), 5 pmoles of pilot
oligonucleotide (see pilot oligos in Figure 12A) and 0.5 units of DNAPTaq or 0.5
µl of Cleavase™ BB in the E. coli extract (see above), in a total volume of 10 µl of
10 mM Tris-Cl, pH 8.5, 50 mM KCl, 1.5 mM MgCl2.

Reactions were initiated at 65°C by the addition of pre-warmed enzyme,
then shifted to the final incubation temperature for 30 minutes. The results are
shown in Figure 25A. Samples in lanes 1-4 are the results with native Taq DNAP,
while lanes 5-8 show the results with Cleavase™ BB. The reactions for lanes 1,
2, 5, and 6 were performed at 65°C and reactions for lanes 3, 4, 7, and 8 were
performed at 50°C and all were stopped at temperature by the addition of 8 µl of
95% formamide with 20 mM EDTA and 0.05% marker dyes. Samples were heated
to 75°C for 2 minutes immediately before electrophoresis through a 10%
acrylamide gel (19:1 cross-linked), with 7 M urea, in a buffer of 45 mM
Tris-Borate, pH 8.3, 1.4 mM EDTA. The expected product in reactions 1, 2, 5,
and 6 is 85 nucleotides long; in reactions 3 and 7, the expected product is 27
nucleotides long. Reactions 4 and 8 were performed without pilot, and should
remain at 206 nucleotides. The faint band seen at 24 nucleotides is residual
end-labeled primer from the PCR.

The surprising result is that Cleavase™ BB under these conditions causes all
of the label to appear in a very small species, suggesting the possibility that the
enzyme completely hydrolyzed the substrate. To determine the composition of the
fastest-migrating band seen in lanes 5-8 (reactions performed with the deletion
mutant), samples of the 206 base pair duplex were treated with either T7 gene 6
exonuclease (USB) or with calf intestine alkaline phosphatase (Promega), according
to manufacturers' instructions, to produce either labeled mononucleotide (lane a of
Figure 25B) or free 32P-labeled inorganic phosphate (lane b of Figure 25B),
respectively. These products, along with the products seen in lane 7 of panel A
were resolved by brief electrophoresis through a 20% acrylamide gel (19:1
cross-link), with 7 M urea, in a buffer of 45 mM Tris-Borate, pH 8.3, 1.4 mM EDTA.
Cleavase™ BB is thus capable of converting the substrate to mononucleotides.

EXAMPLE 7
Nibbling Is Duplex Dependent 

The nibbling by Cleavase™ BB is duplex dependent. In this example,
internally labeled, single strands of the 206-mer were produced by 15 cycles of
primer extension incorporating α-32P labeled dCTP combined with all four
unlabeled dNTPs, using an unlabeled 206-bp fragment as a template. Single and
double stranded products were resolved by electrophoresis through a non-denaturing
6% polyacrylamide gel (29:1 cross-link) in a buffer of 45 mM Tris-Borate, pH 8.3,
1.4 mM EDTA, visualized by autoradiography, excised from the gel, eluted by
passive diffusion, and concentrated by ethanol precipitation.

The cleavage reactions comprised 0.04 pmoles of substrate DNA, and 2 µl
of Cleavase™ BB (in an E. coli extract as described above) in a total volume of
40 µl of 10 mM Tris-Cl, pH 8.5, 50 mM KCl, 1.5 mM MgCl2. Reactions were
initiated by the addition of pre-warmed enzyme; 10 µl aliquots were removed at 5,
10, 20, and 30 minutes, and transferred to prepared tubes containing 8 µl of 95%
formamide with 30 mM EDTA and 0.05% marker dyes. Samples were heated to
75°C for 2 minutes immediately before electrophoresis through a 10% acrylamide
gel (19:1 cross-linked), with 7 M urea, in a buffer of 45 mM Tris-Borate, pH 8.3,
1.4 mM EDTA. Results were visualized by autoradiography as shown in Figure
26. Clearly, the cleavage by Cleavase™ BB depends on a duplex structure; no
cleavage of the single strand structure is detected whereas cleavage of the 206-mer
duplex is complete.

EXAMPLE 8 
Nibbling Can Be Target Directed 

The nibbling activity of the DNAPs of the present invention can be
employed with success in a detection assay. One embodiment of such an assay is
shown in Figure 27. In this assay, a labelled oligo is employed that is specific for
a target sequence. The oligo is in excess of the target so that hybridization is
rapid. In this embodiment, the oligo contains two fluorescein labels whose
proximity on the oligo causes their emission to be quenched. When the DNAP is
permitted to nibble the oligo the labels separate and are detectable. The shortened
duplex is destabilized and dissociates. Importantly, the target is now free to react
with an intact labelled oligo. The reaction can continue until the desired level of
detection is achieved. An analogous, although different, type of cycling assay has
been described employing lambda exonuclease. See C.G. Copley and C. Boot,
BioTechniques 13:888 (1992).

The success of such an assay depends on specificity. In other words, the
oligo must hybridize to the specific target. It is also preferred that the assay be
sensitive; the oligo ideally should be able to detect small amounts of target.
Figure 28A shows a 5'-end 32P-labelled primer bound to a plasmid target sequence.
In this case, the plasmid was pUC19 (commercially available) which was heat
denatured by boiling two (2) minutes and then quick chilling. The primer is a
21-mer (SEQ ID NO:39). The enzyme employed was Cleavase™ BX (a dilution
equivalent to 5 x 10^-3 µl extract) in 100 mM KCl, 10 mM Tris-Cl, pH 8.3, 2 mM
MnCl2. The reaction was performed at 55°C for sixteen (16) hours with or without
genomic background DNA (from chicken blood). The reaction was stopped by the
addition of 8 µl of 95% formamide with 20 mM EDTA and marker dyes.

The products of the reaction were resolved by PAGE (10% polyacrylamide,
19:1 cross link, 1 x TBE) as seen in Figure 28B. Lane "M" contains the labelled
21-mer. Lanes 1-3 contain no specific target, although Lanes 2 and 3 contain 100
ng and 200 ng of genomic DNA, respectively. Lanes 4, 5 and 6 all contain
specific target with either 0 ng, 100 ng or 200 ng of genomic DNA, respectively.
It is clear that conversion to mononucleotides occurs in Lanes 4, 5 and 6 regardless
of the presence or amount of background DNA. Thus, the nibbling can be target
directed and specific.

EXAMPLE 9
Cleavase Purification 

As noted above, expressed thermostable proteins, i.e., the 5' nucleases, were
isolated from crude bacterial cell extracts. The precipitated E. coli proteins were
then, along with other cell debris, removed by centrifugation. In this example,
cells expressing the BN clone were cultured and collected (500 grams). For each
gram (wet weight) of E. coli, 3 ml of lysis buffer (50 mM Tris-HCl, pH 8.0, 1 mM
EDTA, 100 µM NaCl) was added. The cells were lysed with 200 µg/ml lysozyme
at room temperature for 20 minutes. Thereafter deoxycholic acid was added to
make a 0.2% final concentration and the mixture was incubated 15 minutes at room
temperature.

The lysate was sonicated for approximately 6-8 minutes at 0°C. The
precipitate was removed by centrifugation (39,000g for 20 minutes).
Polyethyleneimine was added (0.5%) to the supernatant and the mixture was
incubated on ice for 15 minutes.

The mixture was centrifuged (5,000g for 15 minutes) and the supernatant was 
retained. This was heated for 30 minutes at 60°C and then centrifuged again 
(5,000g for 15 minutes) and the supernatant was again retained. 

The supernatant was precipitated with 35% ammonium sulfate at 4°C for 15
minutes. The mixture was then centrifuged (5,000g for 15 minutes) and the
supernatant was removed. The precipitate was then dissolved in (0.25 M KCl, 20
Tris pH 7.6, 0.2% Tween and 0.1 EDTA) and then dialyzed against Binding
Buffer (8X Binding Buffer comprises: 40 mM imidazole, 4 M NaCl, 160 mM
Tris-HCl, pH 7.9).

The solubilized protein is then purified on the Ni++ column (Novagen). The
Binding Buffer is allowed to drain to the top of the column bed, and the column is
loaded with the prepared extract. A flow rate of about 10 column volumes per hour
is optimal for efficient purification. If the flow rate is too fast, more impurities will
contaminate the eluted fraction.

The column is washed with 25 ml (10 volumes) of 1X Binding Buffer and
then washed with 15 ml (6 volumes) of 1X Wash Buffer (8X Wash Buffer
comprises: 480 mM imidazole, 4 M NaCl, 160 mM Tris-HCl, pH 7.9). The bound
protein was eluted with 15 ml (6 volumes) of 1X Elute Buffer (4X Elute Buffer
comprises: 4 mM imidazole, 2 M NaCl, 80 mM Tris-HCl, pH 7.9). Protein is then
reprecipitated with 35% ammonium sulfate as above. The precipitate was then
dissolved and dialyzed against 20 mM Tris, 100 mM KCl, 1 mM EDTA. The
solution was brought up to 0.1% each of Tween 20 and NP-40 and stored at 4°C.
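The working (1X) buffer compositions and the column bed volume implied by the purification steps above can be derived from the stated stock recipes and wash volumes. The Python sketch below is a worked example only; the function and dictionary names are hypothetical, and only the Binding and Wash Buffers are diluted here.

```python
def dilute(stock_mM: dict, fold: float) -> dict:
    """Return working concentrations (mM) after diluting a stock `fold` times."""
    return {component: conc / fold for component, conc in stock_mM.items()}

# 8X stock recipes as stated in the text, expressed in mM
binding_8x = {"imidazole": 40.0, "NaCl": 4000.0, "Tris-HCl": 160.0}
wash_8x = {"imidazole": 480.0, "NaCl": 4000.0, "Tris-HCl": 160.0}

binding_1x = dilute(binding_8x, 8)
wash_1x = dilute(wash_8x, 8)

# "25 ml (10 volumes)" and "15 ml (6 volumes)" imply the same bed volume
bed_volume_ml = 25.0 / 10
assert bed_volume_ml == 15.0 / 6

# "about 10 column volumes per hour"
flow_rate_ml_per_hr = 10 * bed_volume_ml

print("1X Binding Buffer (mM):", binding_1x)
print("1X Wash Buffer (mM):", wash_1x)
print(f"bed volume {bed_volume_ml} ml, flow rate about {flow_rate_ml_per_hr} ml/hr")
```

On these figures the column bed is 2.5 ml, the recommended flow rate is roughly 25 ml per hour, and the wash step raises imidazole from 5 mM to 60 mM while salt and Tris stay constant.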

From the above, it should be clear that the present invention provides novel
cleaving enzymes having heretofore undisclosed nuclease activities. The enzymes
can be employed with success in target detection assays of various designs. These
assays do not require that the sample DNA be amplified prior to detection and
therefore offer an improvement in DNA-based detection technology.

(B) STREET: 220 Montgomery Street, Suite 2200
(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible

(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(C) REFERENCE/DOCKET NUMBER: FORS-01000
(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2505 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

ATGAGGGGGA TGCTGCCCCT CTTTGAGCCC AAGGGCCGGG TCCTCCTGGT GGACGGCCAC 60

CACCTGGCCT ACCGCACCTT CCACGCCCTG AAGGGCCTCA CCACCAGCCG GGGGGAGCCG 120

GTGCAGGCGG TCTACGGCTT CGCCAAGAGC CTCCTCAAGG CCCTCAAGGA GGACGGGGAC 180

GCGGTGATCG TGGTCTTTGA CGCCAAGGCC CCCTCCTTCC [illegible] [illegible] 240

TACAAGGCGG GCCGGGCCCC CACGCCGGAG GACTTTCCCC [illegible] [illegible] 300

GAGCTGGTGG ACCTCCTGGG GCTGGCGCGC CTCGAGGTCC [illegible] [illegible] 360

GTCCTGGCCA GCCTGGCCAA GAAGGCGGAA AAGGAGGGCT [illegible] [illegible] 420

GCCGACAAAG ACCTTTACCA GCTCCTTTCC GACCGCATCC [illegible] [illegible] 480

TACCTCATCA CCCCGGCCTG GCTTTGGGAA AAGTACGGCC [illegible] [illegible] 540

GACTACCGGG CCCTGACCGG GGACGAGTCC GACAACCTTC [illegible] [illegible] 600

GAGAAGACGG CGAGGAAGCT TCTGGAGGAG TGGGGGAGCC [illegible] [illegible] 660

CTGGACCGGC TGAAGCCCGC CATCCGGGAG AAGATCCTGG CCCACATGGA [illegible] 720

CTCTCCTGGG ACCTGGCCAA GGTGCGCACC GACCTGCCCC TGGAGGTGGA CTTCGCCAAA 780

AGGCGGGAGC CCGACCGGGA GAGGCTTAGG GCCTTTCTGG AGAGGCTTGA GTTTGGCAGC 840

CTCCTCCACG AGTTCGGCCT TCTGGAAAGC CCCAAGGCCC TGGAGGAGGC CCCCTGGCCC 900

CCGCCGGAAG GGGCCTTCGT GGGCTTTGTG CTTTCCCGCA [illegible] [illegible] 960

CTTCTGGCCC TGGCCGCCGC CAGGGGGGGC [illegible] [illegible] [illegible] 1020

GCCCTCAGGG ACCTGAAGGA GGCGCGGGGG [illegible] [illegible] CGTTCTGGCC 1080

CTGAGGGAAG GCCTTGGCCT CCCGCCCGGC GACGACCCCA [illegible] [illegible] 1140

GACCCTTCCA ACACCACCCC CGAGGGGGTG GCCCGGCGCT [illegible] [illegible] 1200

GAGGCGGGGG AGCGGGCCGC CCTTTCCGAG AGGCTCTTCG CCAACCTGTG [illegible] 1260

GAGGGGGAGG AGAGGCTCCT TTGGCTTTAC CGGGAGGTGG [illegible] [illegible] 1320

CTGGCCCACA TGGAGGCCAC GGGGGTGCGC CTGGACGTGG [illegible] [illegible] 1380

CTGGAGGTGG CCGAGGAGAT CGCCCGCCTC GAGGCCGAGG [illegible] [illegible] 1440

CCCTTCAACC TCAACTCCCG GGACCAGCTG GAAAGGGTCC [illegible] [illegible] 1500

CCCGCCATCG GCAAGACGGA GAAGACCGGC [illegible] [illegible] [illegible] 1560

GCCCTCCGCG AGGCCCACCC CATCGTGGAG AAGATCCTGC [illegible] [illegible] 1620

CTGAAGAGCA CCTACATTGA CCCCTTGCCG GACCTCATCC [illegible] [illegible] 1680

CACACCCGCT TCAACCAGAC GGCCACGGCC ACGGGCAGGC [illegible] [illegible] 1740

CTCCAGAACA TCCCCGTCCG CACCCCGCTT GGGCAGAGGA [illegible] [illegible] 1800

GAGGAGGGGT GGCTATTGGT GGCCCTGGAC TATAGCCAGA [illegible] [illegible] 1860

CACCTCTCCG GCGACGAGAA CCTGATCCGG GTCTTCCAGG [illegible] [illegible] 1920

GAGACCGCCA GCTGGATGTT CGGCGTCCCC CGGGAGGCCG TGGACCCCCT [illegible] 1980

[illegible] [illegible] [illegible] TACGGCATGT CGGCCCACCG CCTCTCCCAG 2040

GAGCTAGCCA TCCTTACGAG GAGGCCCAGG CCTTCATTGA GCGCTACTTT CAGAGCTTCC 2100

CCAAGGTGCG GGCCTGGATT GAGAAGACCC TGGAGGAGGG CAGGAGGCGG GGGTACGTGG 2160

AGACCCTCTT CGGCCGCCGC CGCTACGTGC CAGACCTAGA GGCCCGGGTG AAGAGCGTGC 2220

GGGAGGCGGC CGAGCGCATG GCCTTCAACA TGCCCGTCCA GGGCACCGCC GCCGACCTCA 2280

TGAAGCTGGC TATGGTGAAG CTCTTCCCCA GGCTGGAGGA AATGGGGGCC AGGATGCTCC 2340

TTCAGGTCCA CGACGAGCTG GTCCTCGAGG CCCCAAAAGA GAGGGCGGAG GCCGTGGCCC 2400

GGCTGGCCAA GGAGGTCATG GAGGGGGTGT ATCCCCTGGC CGTGCCCCTG GAGGTGGAGG 2460

TGGGGATAGG GGAGGACTGG CTCTCCGCCA AGGAGTGATA CCACC 2505

(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2496 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:



ATGGCGATGC TTCCCCTCTT TGAGCCCAAA GGCCGCGTGC [illegible] [illegible] 60

CTGGCCTACC GCACCTTCTT TGCCCTCAAG [illegible] [illegible] [illegible] 120

CAGGCGGTCT ACGGCTTCGC CAAAAGCCTC CTCAAGGCCC [illegible] [illegible] 180

GTGGTGGTGG TCTTTGACGC CAAGGCCCCC TCCTTCCGCC [illegible] [illegible] 240

AAGGCGGGCC GGGCCCCCAC CCCGGAGGAC TTTCCCCGGC [illegible] [illegible] 300

TTGGTGGACC TCCTAGGCCT TGTGCGGCTG GAGGTTCCCG [illegible] [illegible] 360

CTGGCCACCC TGGCCAAGCG GGCGGAAAAG GAGGGGTACG AGGTGCGCAT [illegible] 420

GACCGCGACC TCTACCAGCT CCTTTCGGAG CGCATCGCCA TCCTCCACCC [illegible] 480

CTGATCACCC CGGCGTGGCT TTACGAGAAG TACGGCCTGC GCCCGGAGCA GTGGGTGGAC 540

TACCGGGCCC TGGCGGGGGA CCCCTCGGAT AACATCCCCG GGGTGAAGGG CATCGGGGAG 600

AAGACCGCCC AGAGGCTCAT CCGCGAGTGG GGGAGCCTGG AAAACCTCTT CCAGCACCTG 660

GACCAGGTGA AGCCCTCCTT GCGGGAGAAG CTCCAGGCGG GCATGGAGGC CCTGGCCCTT 720

TCCCGGAAGC TTTCCCAGGT GCACACTGAC CTGCCCCTGG AGGTGGACTT CGGGAGGCGC 780

CGCACACCCA ACCTGGAGGG TCTGCGGGCT TTTTTGGAGC GGTTGGAGTT TGGAAGCCTC 840

CTCCACGAGT TCGGCCTCCT GGAGGGGCCG AAGGCGGCAG AGGAGGCCCC CTGGCCCCCT 900

CCGGAAGGGG CTTTTTTGGG CTTTTCCTTT TCCCGTCCCG AGCCCATGTG GGCCGAGCTT 960

CTGGCCCTGG CTGGGGCGTG GGAGGGGCGC CTCCATCGGG CACAAGACCC CCTTAGGGGC 1020

CTGAGGGACC TTAAGGGGGT GCGGGGAATC CTGGCCAAGG ACCTGGCGGT TTTGGCCCTG 1080

CGGGAGGGCC TGGACCTCTT CCCAGAGGAC GACCCCATGC TCCTGGCCTA CCTTCTGGAC 1140

CCCTCCAACA CCACCCCTGA GGGGGTGGCC CGGCGTTACG GGGGGGAGTG GACGGAGGAT 1200

GCGGGGGAGA GGGCCCTCCT GGCCGAGCGC CTCTTCCAGA CCCTAAAGGA GCGCCTTAAG 1260

GGAGAAGAAC GCCTGCTTTG GCTTTACGAG GAGGTGGAGA AGCCGCTTTC CCGGGTGTTG 1320

GCCCGGATGG AGGCCACGGG GGTCCGGCTG GACGTGGCCT ACCTCCAGGC CCTCTCCCTG 1380

GAGGTGGAGG CGGAGGTGCG CCAGCTGGAG [illegible] [illegible] [illegible] 1440

TTCAACCTCA ACTCCCGCGA CCAGCTGGAG [illegible] [illegible] [illegible] 1500

GCCATCGGCA AGACGGAGAA GACGGGGAAA CGCTCCACCA [illegible] [illegible] 1560

CTGCGAGAGG CCCACCCCAT CGTGGACCGC ATCCTGCAGT [illegible] [illegible] 1620

AAGAACACCT ACATAGACCC CCTGCCCGCC CTGGTCCACC [illegible] [illegible] 1680

ACCCGCTTCA ACCAGACGGC CACCGCCACG GGCAGGCTTT [illegible] [illegible] 1740

CAGAACATCC CCGTGCGCAC CCCTCTGGGC [illegible] [illegible] [illegible] 1800

GAGGGCTGGG TGCTGGTGGT CTTGGACTAC AGCCAGATTG [illegible] [illegible] 1860

CTCTCCGGGG ACGAGAACCT GATCCGGGTC TTTCAGGAGG GGAGGGACAT CCACACCCAG 1920

ACCGCCAGCT GGATGTTCGG CGTTTCCCCC GAAGGGGTAG ACCCTCTGAT GCGCCGGGCG 1980

GCCAAGACCA TCAACTTCGG GGTGCTCTAC GGCATGTCCG CCCACCGCCT CTCCGGGGAG 2040

CTTTCCATCC CCTACGAGGA GGCGGTGGCC TTCATTGAGC GCTACTTCCA [illegible] 2100

AAGGTGCGGG CCTGGATTGA [illegible] [illegible] [illegible] GTATGTGGAG 2160

[illegible] [illegible] [illegible] [illegible] CCCGGGTGAA GAGCGTGCGC 2220

[illegible] [illegible] [illegible] [illegible] [illegible] CGACCTCATG 2280

AAGCTGGCCA TGGTGCGGCT TTTCCCCCGG CTTCAGGAAC TGGGGGCGAG GATGCTTTTG 2340

CAGGTGCACG ACGAGCTGGT CCTCGAGGCC CCCAAGGACC GGGCGGAGAG GGTAGCCGCT 2400

TTGGCCAAGG AGGTCATGGA GGGGGTCTGG CCCCTGCAGG TGCCCCTGGA GGTGGAGGTG 2460

GGCCTGGGGG AGGACTGGCT CTCCGCCAAG GAGTAG 2496



(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2504 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

ATGGAGGCGA TGCTTCCGCT CTTTGAACCC AAAGGCCGGG TCCTCCTGGT GGACGGCCAC 60

CACCTGGCCT ACCGCACCTT CTTCGCCCTG AAGGGCCTCA CCACGAGCCG GGGCGAACCG 120

GTGCAGGCGG TCTACGGCTT CGCCAAGAGC CTCCTCAAGG CCCTGAAGGA GGACGGGTAC 180

AAGGCCGTCT TCGTGGTCTT TGACGCCAAG GCCCCCTCCT TCCGCCACGA GGCCTACGAG 240

GCCTACAAGG CGGGGAGGGC CCCGACCCCC GAGGACTTCC CCCGGCAGCT CGCCCTCATC 300

AAGGAGCTGG TGGACCTCCT GGGGTTTACC CGCCTCGAGG TCCCCGGCTA CGAGGCGGAC 360

GACGTTCTCG CCACCCTGGC CAAGAAGGCG GAAAAGGAGG GGTACGAGGT GCGCATCCTC 420

ACCGCCGACC GCGACCTCTA CCAACTCGTC TCCGACCGCG TCGCCGTCCT CCACCCCGAG 480

GGCCACCTCA TCACCCCGGA GTGGCTTTGG GAGAAGTACG GCCTCAGGCC GGAGCAGTGG 540

GTGGACTTCC GCGCCCTCGT GGGGGACCCC TCCGACAACC TCCCCGGGGT CAAGGGCATC 600

GGGGAGAAGA CCGCCCTCAA GCTCCTCAAG GAGTGGGGAA GCCTGGAAAA CCTCCTCAAG 660

AACCTGGACC GGGTAAAGCC AGAAAACGTC CGGGAGAAGA TCAAGGCCCA CCTGGAAGAC 720

CTCAGGCTCT CCTTGGAGCT CTCCCGGGTG CGCACCGACC TCCCCCTGGA GGTGGACCTC 780

GCCCAGGGGC GGGAGCCCGA CCGGGAGGGG CTTAGGGCCT TCCTGGAGAG GCTGGAGTTC 840

GGCAGCCTCC TCCACGAGTT CGGCCTCCTG GAGGCCCCCG CCCCCCTGGA GGAGGCCCCC 900

TGGCCCCCGC CGGAAGGGGC CTTCGTGGGC TTCGTCCTCT CCCGCCCCGA GCCCATGTGG 960

GCGGAGCTTA AAGCCCTGGC CGCCTGCAGG GACGGCCGGG TGCACCGGGC AGCAGACCCC 1020

TTGGCGGGGC TAAAGGACCT CAAGGAGGTC CGGGGCCTCC TCGCCAAGGA CCTCGCCGTC 1080

TTGGCCTCGA GGGAGGGGCT AGACCTCGTG CCCGGGGACG ACCCCATGCT CCTCGCCTAC 1140

CTCCTGGACC CCTCCAACAC CACCCCCGAG GGGGTGGCGC GGCGCTACGG GGGGGAGTGG 1200

ACGGAGGACG CCGCCCACCG GGCCCTCCTC TCGGAGAGGC TCCATCGGAA CCTCCTTAAG 1260

CGCCTCGAGG GGGAGGAGAA GCTCCTTTGG CTCTACCACG AGGTGGAAAA GCCCCTCTCC 1320

CGGGTCCTGG CCCACATGGA GGCCACCGGG GTACGGCTGG ACGTGGCCTA CCTTCAGGCC 1380

CTTTCCCTGG AGCTTGCGGA GGAGATCCGC CGCCTCGAGG AGGAGGTCTT CCGCTTGGCG 1440

GGCCACCCCT TCAACCTCAA CTCCCGGGAC CAGCTGGAAA GGGTGCTCTT TGACGAGCTT 1500

AGGCTTCCCG CCTTGGGGAA GACGCAAAAG ACAGGCAAGC GCTCCACCAG CGCCGCGGTG 1560

CTGGAGGCCC TACGGGAGGC CCACCCCATC GTGGAGAAGA TCCTCCAGCA CCGGGAGCTC 1620

ACCAAGCTCA AGAACACCTA CGTGGACCCC CTCCCAAGCC TCGTCCACCC GAGGACGGGC 1680

CGCCTCCACA CCCGCTTCAA CCAGACGGCC ACGGCCACGG GGAGGCTTAG TAGCTCCGAC 1740

CCCAACCTGC AGAACATCCC CGTCCGCACC CCCTTGGGCC AGAGGATCCG CCGGGCCTTC 1800

GTGGCCGAGG CGGGTTGGGC GTTGGTGGCC CTGGACTATA GCCAGATAGA GCTCCGCGTC 1860

CTCGCCCACC TCTCCGGGGA CGAAAACCTG ATCAGGGTCT TCCAGGAGGG GAAGGACATC 1920

CACACCCAGA CCGCAAGCTG GATGTTCGGC GTCCCCCCGG AGGCCGTGGA CCCCCTGATG 1980

CGCCGGGCGG CCAAGACGGT GAACTTCGGC GTCCTCTACG GCATGTCCGC CCATAGGCTC 2040

TCCCAGGAGC TTGCCATCCC CTACGAGGAG GCGGTGGCCT TTATAGAGGC TACTTCCAAA 2100

GCTTCCCCAA GGTGCGGGCC TGGATAGAAA AGACCCTGGA GGAGGGGAGG AAGCGGGGCT 2160

ACGTGGAAAC CCTCTTCGGA AGAAGGCGCT ACGTGCCCGA CCTCAACGCC CGGGTGAAGA 2220

GCGTCAGGGA GGCCGCGGAG CGCATGGCCT TCAACATGCC CGTCCAGGGC ACCGCCGCCG 2280

ACCTCATGAA GCTCGCCATG GTGAAGCTCT TCCCCCGCCT CCGGGAGATG GGGGCCCGCA 2340

TGCTCCTCCA GGTCCACGAC GAGCTCCTCC TGGAGGCCCC CCAAGCGCGG GCCGAGGAGG 2400

TGGCGGCTTT GGCCAAGGAG GCCATGGAGA AGGCCTATCC CCTCGCCGTG CCCCTGGAGG 2460

TGGAGGTGGG GATGGGGGAG GACTGGCTTT CCGCCAAGGG TTAG 2504

(2) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 832 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

Met Arg Gly Met Leu Pro Leu Phe Glu Pro Lys Gly Arg Val Leu Leu
1 5 10 15

Val Asp Gly His His Leu Ala Tyr Arg Thr Phe His Ala Leu Lys Gly
20 25 30

Leu Thr Thr Ser Arg Gly Glu Pro Val Gln Ala Val Tyr Gly Phe Ala
35 40 45

Lys Ser Leu Leu Lys Ala Leu Lys Glu Asp Gly Asp Ala Val Ile Val
50 55 60

Val Phe Asp Ala Lys Ala Pro Ser Phe Arg His Glu Ala Tyr Gly Gly
65 70 75 80

Tyr Lys Ala Gly Arg Ala Pro Thr Pro Glu Asp Phe Pro Arg Gln Leu
85 90 95

Ala Leu Ile Lys Glu Leu Val Asp Leu Leu Gly Leu Ala Arg Leu Glu
100 105 110

Val Pro Gly Tyr Glu Ala Asp Asp Val Leu Ala Ser Leu Ala Lys Lys
115 120 125

Ala Glu Lys Glu Gly Tyr Glu Val Arg Ile Leu Thr Ala Asp Lys Asp
130 135 140

Leu Tyr Gln Leu Leu Ser Asp Arg Ile His Val Leu His Pro Glu Gly
145 150 155 160

Tyr Leu Ile Thr Pro Ala Trp Leu Trp Glu Lys Tyr Gly Leu Arg Pro
165 170 175

Asp Gln Trp Ala Asp Tyr Arg Ala Leu Thr Gly Asp Glu Ser Asp Asn
180 185 190

Leu Pro Gly Val Lys Gly Ile Gly Glu Lys Thr Ala Arg Lys Leu Leu
195 200 205

Glu Glu Trp Gly Ser Leu Glu Ala Leu Leu Lys Asn Leu Asp Arg Leu
210 215 220

Lys Pro Ala Ile Arg Glu Lys Ile Leu Ala His Met Asp Asp Leu Lys
225 230 235 240

Leu Ser Trp Asp Leu Ala Lys Val Arg Thr Asp Leu Pro Leu Glu Val
245 250 255

Asp Phe Ala Lys Arg Arg Glu Pro Asp Arg Glu Arg Leu Arg Ala Phe
260 265 270

Leu Glu Arg Leu Glu Phe Gly Ser Leu Leu His Glu Phe Gly Leu Leu
275 280 285

Glu Ser Pro Lys Ala Leu Glu Glu Ala Pro Trp Pro Pro Pro Glu Gly
290 295 300

Ala Phe Val Gly Phe Val Leu Ser Arg Lys Glu Pro Met Trp Ala Asp
305 310 315 320

Leu Leu Ala Leu Ala Ala Ala Arg Gly Gly Arg Val His Arg Ala Pro
325 330 335

Glu Pro Tyr Lys Ala Leu Arg Asp Leu Lys Glu Ala Arg Gly Leu Leu
340 345 350

Ala Lys Asp Leu Ser Val Leu Ala Leu Arg Glu Gly Leu Gly Leu Pro
355 360 365

Pro Gly Asp Asp Pro Met Leu Leu Ala Tyr Leu Leu Asp Pro Ser Asn
370 375 380

Thr Thr Pro Glu Gly Val Ala Arg Arg Tyr Gly Gly Glu Trp Thr Glu
385 390 395 400

Glu Ala Gly Glu Arg Ala Ala Leu Ser Glu Arg Leu Phe Ala Asn Leu
405 410 415

Trp Gly Arg Leu Glu Gly Glu Glu Arg Leu Leu Trp Leu Tyr Arg Glu
420 425 430

Val Glu Arg Pro Leu Ser Ala Val Leu Ala His Met Glu Ala Thr Gly
435 440 445

Val Arg Leu Asp Val Ala Tyr Leu Arg Ala Leu Ser Leu Glu Val Ala
450 455 460

Glu Glu Ile Ala Arg Leu Glu Ala Glu Val Phe Arg Leu Ala Gly His
465 470 475 480

Pro Phe Asn Leu Asn Ser Arg Asp Gln Leu Glu Arg Val Leu Phe Asp
485 490 495

Glu Leu Gly Leu Pro Ala Ile Gly Lys Thr Glu Lys Thr Gly Lys Arg
500 505 510

Ser Thr Ser Ala Ala Val Leu Glu Ala Leu Arg Glu Ala His Pro Ile
515 520 525

Val Glu Lys Ile Leu Gln Tyr Arg Glu Leu Thr Lys Leu Lys Ser Thr
530 535 540

Tyr Ile Asp Pro Leu Pro Asp Leu Ile His Pro Arg Thr Gly Arg Leu
545 550 555 560

His Thr Arg Phe Asn Gln Thr Ala Thr Ala Thr Gly Arg Leu Ser Ser
565 570 575

Ser Asp Pro Asn Leu Gln Asn Ile Pro Val Arg Thr Pro Leu Gly Gln
580 585 590

Arg Ile Arg Arg Ala Phe Ile Ala Glu Glu Gly Trp Leu Leu Val Ala
595 600 605

Leu Asp Tyr Ser Gln Ile Glu Leu Arg Val Leu Ala His Leu Ser Gly
610 615 620

Asp Glu Asn Leu Ile Arg Val Phe Gln Glu Gly Arg Asp Ile His Thr
625 630 635 640
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Glu Thr Ala Ser Trp Met Phe Gly Val Pro Arg Glu Ala Val Asp Pro 

645 650 655 

Leu Met Arg Arg Ala Ala Lys Thr lie Asn Phe Gly Val Leu Tyr Gly 

660 665 670 

Met Ser Ala His Arg Leu Ser Gin Glu Leu Ala lie Pro Tyr Glu Glu 
675 680 685 

Ala Gin Ala Phe lie Glu Arg Tyr Phe Gin Ser Phe Pro Lys Val Arg 
690 695 700 

Ala Trp lie Glu Lys Thr Leu Glu Glu Gly Arg Arg Arg Gly Tyr Val 
705 710 715 720 

Glu Thr Leu Phe Gly Arg Arg Arg Tyr Val Pro Asp Leu Glu Ala Arg 

725 730 735 

Val Lys Ser Val Arg Glu Ala Ala Glu Arg Met Ala Phe Asn Met Pro 

740 745 750 

Val Gin Gly Thr Ala Ala Asp Leu Met Lys Leu Ala Met Val Lys Leu 
755 760 765 

Phe Pro Arg Leu Glu Glu Met Gly Ala Arg Met Leu Leu Gin Val His 
770 775 780' 

Asp Glu Leu Val Leu Glu Ala Pro Lys Glu Arg Ala Glu Ala Val Ala 
785 790 795 800 

Arg Leu Ala Lys Glu Val Met Glu Gly Val Tyr Pro Leu Ala Val Pro 

805 810 815 

Leu Glu Val Glu Val Gly He Gly Glu Asp Trp Leu Ser Ala Lys Glu 

820 825 830 

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 831 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

Met Ala Met Leu Pro Leu Phe Glu Pro Lys Gly Arg Val Leu Leu Val
1 5 10 15

Asp Gly His His Leu Ala Tyr Arg Thr Phe Phe Ala Leu Lys Gly Leu
20 25 30

Thr Thr Ser Arg Gly Glu Pro Val Gln Ala Val Tyr Gly Phe Ala Lys
35 40 45

Ser Leu Leu Lys Ala Leu Lys Glu Asp Gly Asp Val Val Val Val Val
50 55 60

Phe Asp Ala Lys Ala Pro Ser Phe Arg His Glu Ala Tyr Glu Ala Tyr
65 70 75 80

Lys Ala Gly Arg Ala Pro Thr Pro Glu Asp Phe Pro Arg Gln Leu Ala
85 90 95

Leu Ile Lys Glu Leu Val Asp Leu Leu Gly Leu Val Arg Leu Glu Val
100 105 110

Pro Gly Phe Glu Ala Asp Asp Val Leu Ala Thr Leu Ala Lys Arg Ala
115 120 125

Glu Lys Glu Gly Tyr Glu Val Arg Ile Leu Thr Ala Asp Arg Asp Leu
130 135 140

Tyr Gln Leu Leu Ser Glu Arg Ile Ala Ile Leu His Pro Glu Gly Tyr
145 150 155 160

Leu Ile Thr Pro Ala Trp Leu Tyr Glu Lys Tyr Gly Leu Arg Pro Glu
165 170 175

Gln Trp Val Asp Tyr Arg Ala Leu Ala Gly Asp Pro Ser Asp Asn Ile
180 185 190

Pro Gly Val Lys Gly Ile Gly Glu Lys Thr Ala Gln Arg Leu Ile Arg
195 200 205

Glu Trp Gly Ser Leu Glu Asn Leu Phe Gln His Leu Asp Gln Val Lys
210 215 220

Pro Ser Leu Arg Glu Lys Leu Gln Ala Gly Met Glu Ala Leu Ala Leu
225 230 235 240

Ser Arg Lys Leu Ser Gln Val His Thr Asp Leu Pro Leu Glu Val Asp
245 250 255

Phe Gly Arg Arg Arg Thr Pro Asn Leu Glu Gly Leu Arg Ala Phe Leu
260 265 270

Glu Arg Leu Glu Phe Gly Ser Leu Leu His Glu Phe Gly Leu Leu Glu
275 280 285

Gly Pro Lys Ala Ala Glu Glu Ala Pro Trp Pro Pro Pro Glu Gly Ala
290 295 300

Phe Leu Gly Phe Ser Phe Ser Arg Pro Glu Pro Met Trp Ala Glu Leu
305 310 315 320

Leu Ala Leu Ala Gly Ala Trp Glu Gly Arg Leu His Arg Ala Gln Asp
325 330 335

Pro Leu Arg Gly Leu Arg Asp Leu Lys Gly Val Arg Gly Ile Leu Ala
340 345 350

Lys Asp Leu Ala Val Leu Ala Leu Arg Glu Gly Leu Asp Leu Phe Pro
355 360 365

Glu Asp Asp Pro Met Leu Leu Ala Tyr Leu Leu Asp Pro Ser Asn Thr
370 375 380

Thr Pro Glu Gly Val Ala Arg Arg Tyr Gly Gly Glu Trp Thr Glu Asp
385 390 395 400

Ala Gly Glu Arg Ala Leu Leu Ala Glu Arg Leu Phe Gln Thr Leu Lys
405 410 415

Glu Arg Leu Lys Gly Glu Glu Arg Leu Leu Trp Leu Tyr Glu Glu Val
420 425 430

Glu Lys Pro Leu Ser Arg Val Leu Ala Arg Met Glu Ala Thr Gly Val
435 440 445

Arg Leu Asp Val Ala Tyr Leu Gln Ala Leu Ser Leu Glu Val Glu Ala
450 455 460

Glu Val Arg Gln Leu Glu Glu Glu Val Phe Arg Leu Ala Gly His Pro
465 470 475 480

Phe Asn Leu Asn Ser Arg Asp Gln Leu Glu Arg Val Leu Phe Asp Glu
485 490 495

Leu Gly Leu Pro Ala Ile Gly Lys Thr Glu Lys Thr Gly Lys Arg Ser
500 505 510

Thr Ser Ala Ala Val Leu Glu Ala Leu Arg Glu Ala His Pro Ile Val
515 520 525

Asp Arg Ile Leu Gln Tyr Arg Glu Leu Thr Lys Leu Lys Asn Thr Tyr
530 535 540

Ile Asp Pro Leu Pro Ala Leu Val His Pro Lys Thr Gly Arg Leu His
545 550 555 560

Thr Arg Phe Asn Gln Thr Ala Thr Ala Thr Gly Arg Leu Ser Ser Ser
565 570 575

Asp Pro Asn Leu Gln Asn Ile Pro Val Arg Thr Pro Leu Gly Gln Arg
580 585 590

Ile Arg Arg Ala Phe Val Ala Glu Glu Gly Trp Val Leu Val Val Leu
595 600 605

Asp Tyr Ser Gln Ile Glu Leu Arg Val Leu Ala His Leu Ser Gly Asp
610 615 620

Glu Asn Leu Ile Arg Val Phe Gln Glu Gly Arg Asp Ile His Thr Gln
625 630 635 640

Thr Ala Ser Trp Met Phe Gly Val Ser Pro Glu Gly Val Asp Pro Leu
645 650 655

Met Arg Arg Ala Ala Lys Thr Ile Asn Phe Gly Val Leu Tyr Gly Met
660 665 670

Ser Ala His Arg Leu Ser Gly Glu Leu Ser Ile Pro Tyr Glu Glu Ala
675 680 685

Val Ala Phe Ile Glu Arg Tyr Phe Gln Ser Tyr Pro Lys Val Arg Ala
690 695 700

Trp Ile Glu Gly Thr Leu Glu Glu Gly Arg Arg Arg Gly Tyr Val Glu
705 710 715 720

Thr Leu Phe Gly Arg Arg Arg Tyr Val Pro Asp Leu Asn Ala Arg Val
725 730 735

Lys Ser Val Arg Glu Ala Ala Glu Arg Met Ala Phe Asn Met Pro Val
740 745 750

Gln Gly Thr Ala Ala Asp Leu Met Lys Leu Ala Met Val Arg Leu Phe
755 760 765

Pro Arg Leu Gln Glu Leu Gly Ala Arg Met Leu Leu Gln Val His Asp
770 775 780

Glu Leu Val Leu Glu Ala Pro Lys Asp Arg Ala Glu Arg Val Ala Ala
785 790 795 800

Leu Ala Lys Glu Val Met Glu Gly Val Trp Pro Leu Gln Val Pro Leu
805 810 815

Glu Val Glu Val Gly Leu Gly Glu Asp Trp Leu Ser Ala Lys Glu
820 825 830

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 834 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

Met Glu Ala Met Leu Pro Leu Phe Glu Pro Lys Gly Arg Val Leu Leu
1 5 10 15

Val Asp Gly His His Leu Ala Tyr Arg Thr Phe Phe Ala Leu Lys Gly
20 25 30

Leu Thr Thr Ser Arg Gly Glu Pro Val Gln Ala Val Tyr Gly Phe Ala
35 40 45

Lys Ser Leu Leu Lys Ala Leu Lys Glu Asp Gly Tyr Lys Ala Val Phe
50 55 60

Val Val Phe Asp Ala Lys Ala Pro Ser Phe Arg His Glu Ala Tyr Glu
65 70 75 80

Ala Tyr Lys Ala Gly Arg Ala Pro Thr Pro Glu Asp Phe Pro Arg Gln
85 90 95

Leu Ala Leu Ile Lys Glu Leu Val Asp Leu Leu Gly Phe Thr Arg Leu
100 105 110

Glu Val Pro Gly Tyr Glu Ala Asp Asp Val Leu Ala Thr Leu Ala Lys
115 120 125

Lys Ala Glu Lys Glu Gly Tyr Glu Val Arg Ile Leu Thr Ala Asp Arg
130 135 140

Asp Leu Tyr Gln Leu Val Ser Asp Arg Val Ala Val Leu His Pro Glu
145 150 155 160

Gly His Leu Ile Thr Pro Glu Trp Leu Trp Glu Lys Tyr Gly Leu Arg
165 170 175

Pro Glu Gln Trp Val Asp Phe Arg Ala Leu Val Gly Asp Pro Ser Asp
180 185 190

Asn Leu Pro Gly Val Lys Gly Ile Gly Glu Lys Thr Ala Leu Lys Leu
195 200 205

Leu Lys Glu Trp Gly Ser Leu Glu Asn Leu Leu Lys Asn Leu Asp Arg
210 215 220

Val Lys Pro Glu Asn Val Arg Glu Lys Ile Lys Ala His Leu Glu Asp
225 230 235 240

Leu Arg Leu Ser Leu Glu Leu Ser Arg Val Arg Thr Asp Leu Pro Leu
245 250 255

Glu Val Asp Leu Ala Gln Gly Arg Glu Pro Asp Arg Glu Gly Leu Arg
260 265 270

Ala Phe Leu Glu Arg Leu Glu Phe Gly Ser Leu Leu His Glu Phe Gly
275 280 285

Leu Leu Glu Ala Pro Ala Pro Leu Glu Glu Ala Pro Trp Pro Pro Pro
290 295 300

Glu Gly Ala Phe Val Gly Phe Val Leu Ser Arg Pro Glu Pro Met Trp
305 310 315 320

Ala Glu Leu Lys Ala Leu Ala Ala Cys Arg Asp Gly Arg Val His Arg
325 330 335

Ala Ala Asp Pro Leu Ala Gly Leu Lys Asp Leu Lys Glu Val Arg Gly
340 345 350

Leu Leu Ala Lys Asp Leu Ala Val Leu Ala Ser Arg Glu Gly Leu Asp
355 360 365

Leu Val Pro Gly Asp Asp Pro Met Leu Leu Ala Tyr Leu Leu Asp Pro
370 375 380

Ser Asn Thr Thr Pro Glu Gly Val Ala Arg Arg Tyr Gly Gly Glu Trp
385 390 395 400

Thr Glu Asp Ala Ala His Arg Ala Leu Leu Ser Glu Arg Leu His Arg
405 410 415

Asn Leu Leu Lys Arg Leu Glu Gly Glu Glu Lys Leu Leu Trp Leu Tyr
420 425 430

His Glu Val Glu Lys Pro Leu Ser Arg Val Leu Ala His Met Glu Ala
435 440 445

Thr Gly Val Arg Leu Asp Val Ala Tyr Leu Gln Ala Leu Ser Leu Glu
450 455 460

Leu Ala Glu Glu Ile Arg Arg Leu Glu Glu Glu Val Phe Arg Leu Ala
465 470 475 480

Gly His Pro Phe Asn Leu Asn Ser Arg Asp Gln Leu Glu Arg Val Leu
485 490 495

Phe Asp Glu Leu Arg Leu Pro Ala Leu Gly Lys Thr Gln Lys Thr Gly
500 505 510

Lys Arg Ser Thr Ser Ala Ala Val Leu Glu Ala Leu Arg Glu Ala His
515 520 525

Pro Ile Val Glu Lys Ile Leu Gln His Arg Glu Leu Thr Lys Leu Lys
530 535 540

Asn Thr Tyr Val Asp Pro Leu Pro Ser Leu Val His Pro Arg Thr Gly
545 550 555 560

Arg Leu His Thr Arg Phe Asn Gln Thr Ala Thr Ala Thr Gly Arg Leu
565 570 575

Ser Ser Ser Asp Pro Asn Leu Gln Asn Ile Pro Val Arg Thr Pro Leu
580 585 590

Gly Gln Arg Ile Arg Arg Ala Phe Val Ala Glu Ala Gly Trp Ala Leu
595 600 605

Val Ala Leu Asp Tyr Ser Gln Ile Glu Leu Arg Val Leu Ala His Leu
610 615 620

Ser Gly Asp Glu Asn Leu Ile Arg Val Phe Gln Glu Gly Lys Asp Ile
625 630 635 640

His Thr Gln Thr Ala Ser Trp Met Phe Gly Val Pro Pro Glu Ala Val
645 650 655

Asp Pro Leu Met Arg Arg Ala Ala Lys Thr Val Asn Phe Gly Val Leu
660 665 670

Tyr Gly Met Ser Ala His Arg Leu Ser Gln Glu Leu Ala Ile Pro Tyr
675 680 685

Glu Glu Ala Val Ala Phe Ile Glu Arg Tyr Phe Gln Ser Phe Pro Lys
690 695 700

Val Arg Ala Trp Ile Glu Lys Thr Leu Glu Glu Gly Arg Lys Arg Gly
705 710 715 720

Tyr Val Glu Thr Leu Phe Gly Arg Arg Arg Tyr Val Pro Asp Leu Asn
725 730 735

Ala Arg Val Lys Ser Val Arg Glu Ala Ala Glu Arg Met Ala Phe Asn
740 745 750

Met Pro Val Gln Gly Thr Ala Ala Asp Leu Met Lys Leu Ala Met Val
755 760 765

Lys Leu Phe Pro Arg Leu Arg Glu Met Gly Ala Arg Met Leu Leu Gln
770 775 780

Val His Asp Glu Leu Leu Leu Glu Ala Pro Gln Ala Arg Ala Glu Glu
785 790 795 800

Val Ala Ala Leu Ala Lys Glu Ala Met Glu Lys Ala Tyr Pro Leu Ala
805 810 815

Val Pro Leu Glu Val Glu Val Gly Met Gly Glu Asp Trp Leu Ser Ala
820 825 830

Lys Gly

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2502 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:



ATGNNGGCGA TGCTTCCCCT CTTTGAGCCC AAAGGCCGGG TCCTCCTGGT GGACGGCCAC 60

CACCTGGCCT ACCGCACCTT CTTCGCCCTG AAGGGCCTCA CCACCAGCCG GGGCGAACCG 120

GTGCAGGCGG TCTACGGCTT CGCCAAGAGC CTCCTCAAGG CCCTGAAGGA GGACGGGGAC 180

NNGGCGGTGN TCGTGGTCTT TGACGCCAAG GCCCCCTCCT TCCGCCACGA GGCCTACGAG 240

GCCTACAAGG CGGGCCGGGC CCCCACCCCG GAGGACTTTC CCCGGCAGCT CGCCCTCATC 300

AAGGAGCTGG TGGACCTCCT GGGGCTTGCG CGCCTCGAGG TCCCCGGPTA [illegible] 360

GACGTNCTGG CCACCCTGGC CAAGAAGGCG GAAAAGGAGG GGTACGAGGT [illegible] 420

ACCGCCGACC GCGACCTCTA CCAGCTCCTT TCCGACCGCA TCGCCGTCCT [illegible] 480

GGGTACCTCA TCACCCCGGC GTGGCTTTGG GAGAAGTACG GCCTGAGGCC [illegible] 540

GTGGACTACC GGGCCCTGGC GGGGGACCCC TCCGACAACC TCCCCGGGGT CAAGGGCATC 600

GGGGAGAAGA CCGCCCNGAA GCTCCTCNAG GAGTGGGGGA GCCTGGAAAA CCTCCTCAAG 660

AACCTGGACC GGGTGAAGCC CGCCNTCCGG GAGAAGATCC AGGCCCACAT GGANGAPPTG 720

ANGCTCTCCT GGGAGCTNTC CCAGGTGCGC ACCGACCTGC CCCTGGAGGT GGACTTPGPP 780

AAGNGGCGGG AGCCCGACCG GGAGGGGCTT AGGGCCTTTC TGGAGAGGCT GGAGTTTGGP 840

AGCCTCCTCC ACGAGTTCGG CCTCCTGGAG GGCCCCAAGG CCCTGGAGGA GGCCCCCTGG 900

CCCCCGCCGG AAGGGGCCTT CGTGGGCTTT GTCCTTTCCC GCCCCGAGCC CATGTGGGCC 960

GAGCTTCTGG CCCTGGCCGC CGCCAGGGAG GGCCGGGTCC ACCGGGCACC AGACCCCTTT 1020

ANGGGCCTNA GGGACCTNAA GGAGGTGCGG GGNCTCCTCG CCAAGGACCT GGCCGTTTTG 1080

GCCCTGAGGG AGGGCCTNGA CCTCNTGCCC GGGGACGACC CCATGCTCCT PGPPTAPPTP 1140

CTGGACCCCT CCAACACCAC CCCCGAGGGG GTGGCCCGGC GCTACGGGGG [illegible] 1200

GAGGANGCGG GGGAGCGGGC CCTCCTNTCC GAGAGGCTCT TCCNGAACCT [illegible] 1260

CTTGAGGGGG AGGAGAGGCT CCTTTGGCTT TACCAGGAGG TGGAGAAGCC [illegible] 1320

GTCCTGGCCC ACATGGAGGC CACGGGGGTN CGGCTGGACG TGGCCTACCT [illegible] 1380

TCCCTGGAGG TGGCGGAGGA GATCCGCCGC CTCGAGGAGG AGGTCTTCCG CCTGGPPGGP 1440

CACCCCTTCA ACCTCAACTC CCGGGACCAG CTGGAAAGGG TGCTCTTTGA [illegible] 1500

CTTCCCGCCA TCGGCAAGAC GGAGAAGACN GGCAAGCGCT CCACCAGCGC CGCCGTGCTG 1560

GAGGCCCTNC GNGAGGCCCA CCCCATCGTG GAGAAGATCC TGCAGTACCG GGAGCTCACC 1620

AAGCTCAAGA ACACCTACAT NGACCCCCTG CCNGNCCTCG TCCACCCCAG GACGGGPPGP 1680

CTCCACACCC GCTTCAACCA GACGGCCACG GCCACGGGCA GGCTTAGTAG [illegible] 1740

AACCTGCAGA ACATCCCCGT CCGCACCCCN CTGGGCCAGA [illegible] [illegible] 1800

GCCGAGGAGG GNTGGGTGTT GGTGGCCCTG GACTATAGCC AGATAGAGCT [illegible] 1860

GCCCACCTCT CCGGGGACGA GAACCTGATC CGGGTCTTCC AGGAGGGGAG [illegible] 1920

ACCCAGACCG CCAGCTGGAT GTTCGGCGTC CCCCCGGAGG CCGTGGACCC CCTGATGCGP 1980

CGGGCGGCCA AGACCATCAA CTTCGGGGTC CTCTACGGCA TGTCCGCCCA CCGCCTCTCC 2040

CAGGAGCTTG CCATCCCCTA CGAGGAGGCG GTGGCCTTCA TTGAGCGCTA CTTCCAGAGC 2100

TTCCCCAAGG TGPGGGPPTG [illegible] [illegible] AGGGCAGGAG GCGGGGGTAC 2160

GTGGAGACCC TCTTCGGCCG CCGGCGCTAC GTGCCCGACC TCAACGCCCG GGTGAAGAGC 2220

GTGCGGGAGG CGGCGGAGCG CATGGCCTTC AACATGCCCG TCCAGGGCAC CGCCGCCGAC 2280

CTCATGAAGC TGGCCATGGT GAAGCTCTTC CCCCGGCTNC AGGAAATGGG GGCCAGGATG 2340

CTCCTNCAGG TCCACGACGA GCTGGTCCTC GAGGCCCCCA AAGAGCGGGC GGAGGNGGTG 2400

GCCGCTTTGG CCAAGGAGGT CATGGAGGGG GTCTATCCCC TGGCCGTGCC CCTGGAGGTG 2460

GAGGTGGGGA TGGGGGAGGA CTGGCTCTCC GCCAAGGAGT AG 2502
(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 833 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

Met Xaa Ala Met Leu Pro Leu Phe Glu Pro Lys Gly Arg Val Leu Leu
1 5 10 15

Val Asp Gly His His Leu Ala Tyr Arg Thr Phe Phe Ala Leu Lys Gly
20 25 30

Leu Thr Thr Ser Arg Gly Glu Pro Val Gln Ala Val Tyr Gly Phe Ala
35 40 45

Lys Ser Leu Leu Lys Ala Leu Lys Glu Asp Gly Asp Ala Val Xaa Val
50 55 60

Val Phe Asp Ala Lys Ala Pro Ser Phe Arg His Glu Ala Tyr Glu Ala
65 70 75 80

Tyr Lys Ala Gly Arg Ala Pro Thr Pro Glu Asp Phe Pro Arg Gln Leu
85 90 95

Ala Leu Ile Lys Glu Leu Val Asp Leu Leu Gly Leu Xaa Arg Leu Glu
100 105 110

Val Pro Gly Tyr Glu Ala Asp Asp Val Leu Ala Thr Leu Ala Lys Lys
115 120 125

Ala Glu Lys Glu Gly Tyr Glu Val Arg Ile Leu Thr Ala Asp Arg Asp
130 135 140

Leu Tyr Gln Leu Leu Ser Asp Arg Ile Ala Val Leu His Pro Glu Gly
145 150 155 160

Tyr Leu Ile Thr Pro Ala Trp Leu Trp Glu Lys Tyr Gly Leu Arg Pro
165 170 175

Glu Gln Trp Val Asp Tyr Arg Ala Leu Xaa Gly Asp Pro Ser Asp Asn
180 185 190

Leu Pro Gly Val Lys Gly Ile Gly Glu Lys Thr Ala Xaa Lys Leu Leu
195 200 205

Xaa Glu Trp Gly Ser Leu Glu Asn Leu Leu Lys Asn Leu Asp Arg Val
210 215 220

Lys Pro Xaa Xaa Arg Glu Lys Ile Xaa Ala His Met Glu Asp Leu Xaa
225 230 235 240

Leu Ser Xaa Xaa Leu Ser Xaa Val Arg Thr Asp Leu Pro Leu Glu Val
245 250 255

Asp Phe Ala Xaa Arg Arg Glu Pro Asp Arg Glu Gly Leu Arg Ala Phe
260 265 270

Leu Glu Arg Leu Glu Phe Gly Ser Leu Leu His Glu Phe Gly Leu Leu
275 280 285

Glu Xaa Pro Lys Ala Leu Glu Glu Ala Pro Trp Pro Pro Pro Glu Gly
290 295 300

Ala Phe Val Gly Phe Val Leu Ser Arg Pro Glu Pro Met Trp Ala Glu
305 310 315 320

Leu Leu Ala Leu Ala Ala Ala Arg Xaa Gly Arg Val His Arg Ala Xaa
325 330 335

Asp Pro Leu Xaa Gly Leu Arg Asp Leu Lys Glu Val Arg Gly Leu Leu
340 345 350

Ala Lys Asp Leu Ala Val Leu Ala Leu Arg Glu Gly Leu Asp Leu Xaa
355 360 365

Pro Gly Asp Asp Pro Met Leu Leu Ala Tyr Leu Leu Asp Pro Ser Asn
370 375 380

Thr Thr Pro Glu Gly Val Ala Arg Arg Tyr Gly Gly Glu Trp Thr Glu
385 390 395 400

Asp Ala Gly Glu Arg Ala Leu Leu Ser Glu Arg Leu Phe Xaa Asn Leu
405 410 415

Xaa Xaa Arg Leu Glu Gly Glu Glu Arg Leu Leu Trp Leu Tyr Xaa Glu
420 425 430

Val Glu Lys Pro Leu Ser Arg Val Leu Ala His Met Glu Ala Thr Gly
435 440 445

Val Arg Leu Asp Val Ala Tyr Leu Gln Ala Leu Ser Leu Glu Val Ala
450 455 460

Glu Glu Ile Arg Arg Leu Glu Glu Glu Val Phe Arg Leu Ala Gly His
465 470 475 480

Pro Phe Asn Leu Asn Ser Arg Asp Gln Leu Glu Arg Val Leu Phe Asp
485 490 495

Glu Leu Gly Leu Pro Ala Ile Gly Lys Thr Glu Lys Thr Gly Lys Arg
500 505 510

Ser Thr Ser Ala Ala Val Leu Glu Ala Leu Arg Glu Ala His Pro Ile
515 520 525

Val Glu Lys Ile Leu Gln Tyr Arg Glu Leu Thr Lys Leu Lys Asn Thr
530 535 540

Tyr Ile Asp Pro Leu Pro Xaa Leu Val His Pro Arg Thr Gly Arg Leu
545 550 555 560

His Thr Arg Phe Asn Gln Thr Ala Thr Ala Thr Gly Arg Leu Ser Ser
565 570 575

Ser Asp Pro Asn Leu Gln Asn Ile Pro Val Arg Thr Pro Leu Gly Gln
580 585 590

Arg Ile Arg Arg Ala Phe Val Ala Glu Glu Gly Trp Xaa Leu Val Ala
595 600 605

Leu Asp Tyr Ser Gln Ile Glu Leu Arg Val Leu Ala His Leu Ser Gly
610 615 620

Asp Glu Asn Leu Ile Arg Val Phe Gln Glu Gly Arg Asp Ile His Thr
625 630 635 640

Gln Thr Ala Ser Trp Met Phe Gly Val Pro Pro Glu Ala Val Asp Pro
645 650 655

Leu Met Arg Arg Ala Ala Lys Thr Ile Asn Phe Gly Val Leu Tyr Gly
660 665 670

Met Ser Ala His Arg Leu Ser Gln Glu Leu Ala Ile Pro Tyr Glu Glu
675 680 685

Ala Val Ala Phe Ile Glu Arg Tyr Phe Gln Ser Phe Pro Lys Val Arg
690 695 700

Ala Trp Ile Glu Lys Thr Leu Glu Glu Gly Arg Arg Arg Gly Tyr Val
705 710 715 720

Glu Thr Leu Phe Gly Arg Arg Arg Tyr Val Pro Asp Leu Asn Ala Arg
725 730 735

Val Lys Ser Val Arg Glu Ala Ala Glu Arg Met Ala Phe Asn Met Pro
740 745 750

Val Gln Gly Thr Ala Ala Asp Leu Met Lys Leu Ala Met Val Lys Leu
755 760 765

Phe Pro Arg Leu Xaa Glu Met Gly Ala Arg Met Leu Leu Gln Val His
770 775 780

Asp Glu Leu Val Leu Glu Ala Pro Lys Xaa Arg Ala Glu Xaa Val Ala
785 790 795 800

Ala Leu Ala Lys Glu Val Met Glu Gly Val Tyr Pro Leu Ala Val Pro
805 810 815

Leu Glu Val Glu Val Gly Xaa Gly Glu Asp Trp Leu Ser Ala Lys Glu
820 825 830

Xaa

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1647 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

ATGAATTCGG GGATGCTGCC CCTCTTTGAG CCCAAGGGCC GGGTCCTCCT GGTGGACGGC 60

CACCACCTGG CCTACCGCAC CTTCCACGCC CTGAAGGGCC TCACCACCAG CCGGGGGGAG 120

CCGGTGCAGG CGGTCTACGG CTTCGCCAAG AGCCTCCTCA AGGCCCTCAA GGAGGACGGG 180

GACGCGGTGA TCGTGGTCTT TGACGCCAAG GCCCCCTCCT TCCGCCACGA GGCCTACGGG 240

GGGTACAAGG CGGGCCGGGC CCCCACGCCG GAGGACTTTC CCCGGCAACT CGCCCTCATC 300

AAGGAGCTGG TGGACCTCCT GGGGCTGGCG CGCCTCGAGG TCCCGGGCTA CGAGGCGGAC 360




WO 94/29482 



PCT/US94/06253 



GACGTCCTGG CCAGCCTGGC CAAGAAGGCG GAAAAGGAGG GCTACGAGGT CCGCATCCTC 420

ACCGCCGACA AAGACCTTTA CCAGCTCCTT TCCGACCGCA TCCACGTCCT CCACCCCGAG 480

GGGTACCTCA TCACCCCGGC CTGGCTTTGG GAAAAGTACG GCCTGAGGCC CGACCAGTGG 540

GCCGACTACC GGGCCCTGAC CGGGGACGAG TCCGACAACC TTCCCGGGGT CAAGGGCATC 600

GGGGAGAAGA CGGCGAGGAA GCTTCTGGAG GAGTGGGGGA GCCTGGAAGC CCTCCTCAAG 660

AACCTGGACC GGCTGAAGCC CGCCATCCGG GAGAAGATCC TGGCCCACAT GGACGATCTG 720

AAGCTCTCCT GGGACCTGGC CAAGGTGCGC ACCGACCTGC CCCTGGAGGT GGACTTCGCC 780

AAAAGGCGGG AGCCCGACCG GGAGAGGCTT AGGGCCTTTC TGGAGAGGCT TGAGTTTGGC 840

AGCCTCCTCC ACGAGTTCGG CCTTCTGGAA AGCCCCAAGG CCCTGGAGGA GGCCCCCTGG 900

CCCCCGCCGG AAGGGGCCTT CGTGGGCTTT GTGCTTTCCC GCAAGGAGCC CATGTGGGCC 960

GATCTTCTGG CCCTGGCCGC CGCCAGGGGG GGCCGGGTCC ACCGGGCCCC CGAGCCTTAT 1020

AAAGCCCTCA GGGACCTGAA GGAGGCGCGG GGGCTTCTCG CCAAAGACCT GAGCGTTCTG 1080

GCCCTGAGGG AAGGCCTTGG CCTCCCGCCC GGCGACGACC CCATGCTCCT CGCCTACCTC 1140

CTGGACCCTT CCAACACCAC CCCCGAGGGG GTGGCCCGGC GCTACGGCGG GGAGTGGACG 1200

GAGGAGGCGG GGGAGCGGGC CGCCCTTTCC GAGAGGCTCT TCGCCAACCT GTGGGGGAGG 1260

CTTGAGGGGG AGGAGAGGCT CCTTTGGCTT TACCGGGAGG TGGAGAGGCC CCTTTCCGCT 1320

GTCCTGGCCC ACATGGAGGC CACGGGGGTG CGCCTGGACG TGGCCTATCT CAGGGCCTTG 1380

TCCCTGGAGG TGGCCGGGGA GATCGCCCGC CTCGAGGCCG AGGTCTTCCG CCTGGCCGGC 1440

CACCCCTTCA ACCTCAACTC CCGGGACCAG CTGGAAAGGG TCCTCTTTGA CGAGCTAGGG 1500

CTTCCCGCCA TCGGCAAGAC GGAGAAGACC GGCAAGCGCT CCACCAGCGC CGCCGTCCTG 1560

GAGGCCCTCC GCGAGGCCCA CCCCATCGTG GAGAAGATCC TGCAGGCATG CAAGCTTGGC 1620

ACTGGCCGTC GTTTTACAAC GTCGTGA 1647
(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2088 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

ATGAATTCGG GGATGCTGCC CCTCTTTGAG CCCAAGGGCC GGGTCCTCCT GGTGGACGGC 60

CACCACCTGG CCTACCGCAC CTTCCACGCC CTGAAGGGCC TCACCACCAG CCGGGGGGAG 120

CCGGTGCAGG CGGTCTACGG CTTCGCCAAG AGCCTCCTCA AGGCCCTCAA GGAGGACGGG 180

GACGCGGTGA TCGTGGTCTT TGACGCCAAG GCCCCCTCCT TCCGCCACGA GGCCTACGGG 240

GGGTACAAGG CGGGCCGGGC CCCCACGCCG GAGGACTTTC CCCGGCAACT CGCCCTCATC 300

AAGGAGCTGG TGGACCTCCT GGGGCTGGCG CGCCTCGAGG TCCCGGGCTA CGAGGCGGAC 360

GACGTCCTGG CCAGCCTGGC CAAGAAGGCG [illegible] [illegible] [illegible] 420

ACCGCCGACA AAGACCTTTA CCAGCTCCTT TCCGACCGCA TCCACGTCCT PPAPPPPGAG 480

GGGTACCTCA TCACCCCGGC CTGGCTTTGG GAAAAGTACG GCCTGAGGCC CGACCAGTGG 540

GCCGACTACC GGGCCCTGAC CGGGGACGAG TCCGACAACC TTCCCGGGGT CAAGGGPATP 600

GGGGAGAAGA CGGCGAGGAA GCTTCTGGAG GAGTGGGGGA GPPTGGAAGP [illegible] 660

AACCTGGACC GGCTGAAGCC CGCCATCCGG GAGAAGATCC TGGPPPAPAT [illegible] 720

AAGCTCTCCT GGGACCTGGC CAAGGTGCGC ACCGACCTGC [illegible] [illegible] 780

AAAAGGCGGG AGCCCGACCG GGAGAGGCTT AGGGCCTTTC TGGAGAGGPT [illegible] 840

AGCCTCCTCC ACGAGTTCGG CCTTCTGGAA AGCCCCAAGG PPPTGGAGGA [illegible] 900

CCCCCGCCGG AAGGGGCCTT CGTGGGCTTT GTGCTTTCCC GPAAGGAGPP [illegible] 960

GATCTTCTGG CCCTGGCCGC CGCCAGGGGG GGCCGGGTCP [illegible] [illegible] 1020

AAAGCCCTCA GGGACCTGAA GGAGGCGCGG GGGCTTCTPG [illegible] [illegible] 1080

GCCCTGAGGG AAGGCCTTGG CCTCCCGCCC [illegible] [illegible] [illegible] 1140

CTGGACCCTT CCAACACCAC CCCCGAGGGG [illegible] [illegible] [illegible] 1200

GAGGAGGCGG GGGAGCGGGC [illegible] [illegible] [illegible] [illegible] 1260

CTTGAGGGGG AGGAGAGGCT [illegible] [illegible] [illegible] [illegible] 1320

GTCCTGGCCC ACATGGAGGC [illegible] [illegible] [illegible] [illegible] 1380

TCCCTGGAGG TGGCCGGGGA GATCGCCCGC [illegible] [illegible] [illegible] 1440

CACCCCTTCA ACCTCAACTC CCGGGACCAG CTGGAAAGGG [illegible] [illegible] 1500

CTTCCCGCCA TCGGCAAGAC GGAGAAGACC [illegible] [illegible] [illegible] 1560

GAGGCCCTCC GCGAGGCCCA CCCCATCGTG GAGAAGATPP [illegible] [illegible] 1620

AAGCTGAAGA GCACCTACAT TGACCCCTTG [illegible] [illegible] [illegible] 1680

CTCCACACCC GCTTCAACCA GACGGCCACG [illegible] [illegible] [illegible] 1740

AACCTCCAGA ACATCCCCGT CCGCACCCCG CTTGGGCAGA GGATPPGPPG [illegible] 1800

GCCGAGGAGG GGTGGCTATT GGTGGCCCTG GACTATAGGG [illegible] [illegible] 1860

GCCCACCTCT CCGGCGACGA GAACCTGATC CGGGTCTTCC AGGAGGGGCG GGACATCCAC 1920

ACGGAGACCG CCAGCTGGAT GTTCGGCGTC CCCCGGGAGG CCGTGGACCC CCTGATGCGC 1980

CGGGCGGCCA AGACCATCAA CTTCGGGGTC CTCTACGGCA TGTCGGCCCA CCGCCTCTCC 2040

CAGGAGCTAG CTAGCCATCC CTTACGAGGA GGCCCAGGCC TTCATTGA 2088



(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 962 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:



ATGAATTCGG GGATGCTGCC CCTCTTTGAG CCCAAGGGCC GGGTCCTCCT [illegible] 60

CACCACCTGG CCTACCGCAC CTTCCACGCC CTGAAGGGCC TCACCACCAR [illegible] 120

CCGGTGCAGG CGGTCTACGG CTTCGCCAAG AGCCTCCTCA AGGCCCTCAA [illegible] 180

GACGCGGTGA TCGTGGTCTT TGACGCCAAG GCCCCCTCCT TCCGCCACGA [illegible] 240

GGGTACAAGG CGGGCCGGGC CCCCACGCCG GAGGACTTTC CCCGGCAACT [illegible] 300

AAGGAGCTGG TGGACCTCCT GGGGCTGGCG CGCCTCGAGG TCCCGGGCTA CGAGGCGGAP 360

GACGTCCTGG CCAGCCTGGC CAAGAAGGCG GAAAAGGAGG GCTACGAGGT CCGCATCCTC 420

ACCGCCGACA AAGACCTTTA CCAGCTTCTT TCCGACCGCA TCCACGTCCT CCACCCCGAG 480

GGGTACCTCA TCACCCCGGC CTGGCTTTGG GAAAAGTACG GCCTGAGGCC CGACCAGTGG 540

GCCGACTACC GGGCCCTGAC CGGGGACGAG TCCGACAACC TTCCCGGGGT CAAGGGCATC 600

GGGGAGAAGA CGGCGAGGAA GCTTCTGGAG GAGTGGGGGA GCCTGGAAGC CCTCCTCAAG 660

AACCTGGACC GGCTGAAGCC CGCCATCCGG GAGAAGATCC TGGCCCACAT [illegible] 720

AAGCTCTCCT GGGACCTGGC CAAGGTGCGC ACCGACCTGC CCCTGGAGGT GGACTTCGCC 780

AAAAGGCGGG AGCCCGACCG GGAGAGGCTT AGGGCCTTTC TGGAGAGGCT TGAGTTTGGC 840

AGCCTCCTCC ACGAGTTCGG CCTTCTGGAA AGCCCCAAGT CATGGAGGGG GTGTATCCCC 900

TGGCCGTGCC CCTGGAGGTG GAGGTGGGGA TAGGGGAGGA CTGGCTCTCC GCCAAGGAGT 960

GA 962



(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1600 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:



ATGGAATTCG GGGATGCTGC CCCTCTTTGA GCCCAAGGGC CGGGTCCTCC TGGTGGACGG 60

CCACCACCTG GCCTACCGCA CCTTCCACGC CCTGAAGGGC CTCACCACCA GCCGGGGGGA 120

GCCGGTGCAG GCGGTCTACG GCTTCGCCAA GAGCCTCCTC AAGGCCCTCA AGGAGGACGG 180

GGACGCGGTG ATCGTGGTCT TTGACGCCAA GGCCCCCTCC TTCCGCCACG AGGCCTACGG 240

GGGGTACAAG GCGGGCCGGG CCCCCACGCC GGAGGACTTT CCCCGGCAAC TCGCCCTCAT 300

CAAGGAGCTG GTGGACCTCC TGGGGCTGGC GCGCCTCGAG GTCCCGGGCT ACGAGGCGGA 360

CGACGTCCTG GCCAGCCTGG CCAAGAAGGC GGAAAAGGAG GGCTACGAGG TCCGCATCCT 420

CACCGCCGAC AAAGACCTTT ACCAGCTCCT TTCCGACCGC ATCCACGTCC TCCACCCCGA 480

GGGGTACCTC ATCACCCCGG CCTGGCTTTG GGAAAAGTAC GGCCTGAGGC CCGACCAGTG 540

GGCCGACTAC CGGGCCCTGA CCGGGGACGA GTCCGACAAC CTTCCCGGGG TCAAGGGCAT 600

CGGGGAGAAG ACGGCGAGGA AGCTTCTGGA GGAGTGGGGG AGCCTGGAAG CCCTCCTCAA 660

GAACCTGGAC CGGCTGAAGC CCGCCATCCG GGAGAAGATC CTGGCCCACA TGGACGATCT 720

GAAGCTCTCC TGGGACCTGG CCAAGGTGCG CACCGACCTG CCCCTGGAGG TGGACTTCGC 780

CAAAAGGCGG GAGCCCGACC GGGAGAGGCT TAGGGCCTTT CTGGAGAGGC TTGAGTTTGG 840

CAGCCTCCTC CACGAGTTCG GCCTTCTGGA AAGCCCCAAG ATCCGCCGGG CCTTCATCGC 900

CGAGGAGGGG TGGCTATTGG TGGCCCTGGA CTATAGCCAG ATAGAGCTCA GGGTGCTGGC 960

CCACCTCTCC GGCGACGAGA ACCTGATCCG GGTCTTCCAG GAGGGGCGGG ACATCCACAC 1020

GGAGACCGCC AGCTGGATGT TCGGCGTCCC CCGGGAGGCC GTGGACCCCC TGATGCGCCG 1080

GGCGGCCAAG ACCATCAACT TCGGGGTCCT CTACGGCATG TCGGCCCACC GCCTCTCCCA 1140

GGAGCTAGCC ATCCCTTACG AGGAGGCCCA GGCCTTCATT GAGCGCTACT TTCAGAGCTT 1200

CCCCAAGGTG CGGGCCTGGA TTGAGAAGAC CCTGGAGGAG GGCAGGAGGC GGGGGTACGT 1260

GGAGACCCTC TTCGGCCGCC GCCGCTACGT GCCAGACCTA GAGGCCCGGG TGAAGAGCGT 1320

GCGGGAGGCG GCCGAGCGCA TGGCCTTCAA CATGCCCGTC CGGGGCACCG CCGCCGACCT 1380

CATGAAGCTG GCTATGGTGA AGCTCTTCCC CAGGCTGGAG GAAATGGGGG CCAGGATGCT 1440

CCTTCAGGTC CACGACGAGC TGGTCCTCGA GGCCCCAAAA GAGAGGGCGG AGGCCGTGGC 1500

CCGGCTGGCC AAGGAGGTCA TGGAGGGGGT GTATCCCCTG GCCGTGCCCC TGGAGGTGGA 1560

GGTGGGGATA GGGGAGGACT GGCTCTCCGC CAAGGAGTGA 1600
(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 
CACGAATTCG GGGATGCTGC CCCTCTTTGA GCCCAA 36 
(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 34 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:14: 
GTGAGATCTA TCACTCCTTG GCGGAGAGCC AGTC 34
(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 91 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 
TAATACGACT CACTATAGGG AGACCGGAAT TCGAGCTCGC CCGGGCGAGC TCGAATTCCG 60 
TGTATTCTAT AGTGTCACCT AAATCGAATT C 91 
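
The row layout used throughout this listing (blocks of ten bases followed by a running total) can be checked mechanically, which may help when a scanned copy is suspected of dropped or merged characters. The following is a minimal sketch and is not part of the original disclosure; the function name `check_rows` is hypothetical, and the sketch assumes that every row ends with its cumulative base count.

```python
def check_rows(rows, declared_length):
    """Join the ten-base blocks of a listing and verify the running totals.

    Each row is assumed to look like "BLOCK BLOCK ... COUNT", where COUNT is
    the cumulative number of bases at the end of that row.
    """
    sequence = ""
    for row in rows:
        *blocks, count = row.split()
        sequence += "".join(blocks)
        if len(sequence) != int(count):
            raise ValueError(
                f"row marked {count} ends at base {len(sequence)}"
            )
    if len(sequence) != declared_length:
        raise ValueError("declared LENGTH does not match the rows")
    return sequence


# Rows copied from SEQ ID NO: 15 above (declared LENGTH: 91 base pairs).
SEQ_ID_15_ROWS = [
    "TAATACGACT CACTATAGGG AGACCGGAAT TCGAGCTCGC CCGGGCGAGC TCGAATTCCG 60",
    "TGTATTCTAT AGTGTCACCT AAATCGAATT C 91",
]

seq15 = check_rows(SEQ_ID_15_ROWS, 91)
print(len(seq15))  # 91
```

A row whose block letters do not add up to its printed total raises `ValueError`, which points to the row that needs to be re-read against the original page.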
(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 
TAATACGACT CACTATAGGG 20

(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 
GAATTCGATT TAGGTGACAC TATAGAA 27

(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 
GTAATCATGG TCATAGCTGG TAGCTTGCTA C 31 
(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:
GGATCCTCTA GAGTCGACCT GCAGGCATGC CTACCTTGGT AG 42 
(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:20: 
GGATCCTCTA GAGTCGACCT GCAGGCATGC 30 
(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2502 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 



ATGAATTCGG GGATGCTGCC CCTCTTTGAG CCCAAGGGCC GGGTCCTCCT GGTGGACGGC 60

CACCACCTGG CCTACCGCAC CTTCCACGCC CTGAAGGGCC TCACCACCAG CCGGGGGGAG 120

CCGGTGCAGG CGGTCTACGG CTTCGCCAAG AGCCTCCTCA AGGCCCTCAA GGAGGACGGG 180

GACGCGGTGA TCGTGGTCTT TGACGCCAAG GCCCCCTCCT TCCGCCACGA GGCCTACGGG 240

GGGTACAAGG CGGGCCGGGC CCCCACGCCG GAGGACTTTC CCCGGCAACT CGCCCTCATC 300

AAGGAGCTGG TGGACCTCCT GGGGCTGGCG CGCCTCGAGG TCCCGGGCTA CGAGGCGGAC 360

GACGTCCTGG CCAGCCTGGC CAAGAAGGCG GAAAAGGAGG GCTACGAGGT CCGCATCCTC 420

ACCGCCGACA AAGACCTTTA CCAGCTCCTT TCCGACCGCA TCCACGTCCT CCACCCCGAG 480

GGGTACCTCA TCACCCCGGC CTGGCTTTGG GAAAAGTACG GCCTGAGGCC CGACCAGTGG 540

GCCGACTACC GGGCCCTGAC CGGGGACGAG TCCGACAACC TTCCCGGGGT CAAGGGCATC 600

GGGGAGAAGA CGGCGAGGAA GCTTCTGGAG GAGTGGGGGA GCCTGGAAGC CCTCCTCAAG 660

AACCTGGACC GGCTGAAGCC CGCCATCCGG GAGAAGATCC TGGCCCACAT GGACGATCTG 720

AAGCTCTCCT GGGACCTGGC CAAGGTGCGC ACCGACCTGC CCCTGGAGGT GGACTTCGCC 780

AAAAGGCGGG AGCCCGACCG GGAGAGGCTT AGGGCCTTTC TGGAGAGGCT TGAGTTTGGC 840

AGCCTCCTCC ACGAGTTCGG CCTTCTGGAA AGCCCCAAGG CCCTGGAGGA GGCCCCCTGG 900

CCCCCGCCGG AAGGGGCCTT CGTGGGCTTT GTGCTTTCCC GCAAGGAGCC CATGTGGGCC 960

GATCTTCTGG CCCTGGCCGC CGCCAGGGGG GGCCGGGTCC ACCGGGCCCC CGAGCCTTAT 1020

AAAGCCCTCA GGGACCTGAA GGAGGCGCGG GGGCTTCTCG CCAAAGACCT GAGCGTTCTG 1080

GCCCTGAGGG AAGGCCTTGG CCTCCCGCCC GGCGACGACC CCATGCTCCT CGCCTACCTC 1140

CTGGACCCTT CCAACACCAC CCCCGAGGGG [illegible] [illegible] [illegible] 1200

GAGGAGGCGG GGGAGCGGGC CGCCCTTTCC [illegible] [illegible] [illegible] 1260

CTTGAGGGGG AGGAGAGGCT CCTTTGGCTT [illegible] [illegible] [illegible] 1320

GTCCTGGCCC ACATGGAGGC CACGGGGGTG CGCCTGGACG [illegible] [illegible] 1380

TCCCTGGAGG TGGCCGGGGA GATCGCCCGC CTCGAGGCCG AGGTGTTGGG [illegible] 1440

CACCCCTTCA ACCTCAACTC CCGGGACCAG CTGGAAAGGG TCCTCTTTGA [illegible] 1500

CTTCCCGCCA TCGGCAAGAC GGAGAAGACC GGCAAGCGCT [illegible] [illegible] 1560

GAGGCCCTCC GCGAGGCCCA CCCCATCGTG GAGAAGATCC [illegible] [illegible] 1620

AAGCTGAAGA GCACCTACAT TGACCCCTTG CCGGACCTCA TCCACCCCAG [illegible] 1680

CTCCACACCC GCTTCAACCA GACGGCCACG GCCACGGGCA GGCTAAGTAG CTCCGATCCC 1740

AACCTCCAGA ACATCCCCGT CCGCACCCCG CTTGGGCAGA GGATCCGCCG GGCCTTCATC 1800

GCCGAGGAGG GGTGGCTATT GGTGGCCCTG GACTATAGCC AGATAGAGCT CAGGGTGCTG 1860

GCCCACCTCT CCGGCGACGA GAACCTGATC CGGGTCTTCC AGGAGGGGCG GGACATCCAC 1920

ACGGAGACCG CCAGCTGGAT GTTCGGCGTC CCCCGGGAGG CCGTGGACCC CCTGATGCGC 1980

CGGGCGGCCA AGACCATCAA CTTCGGGGTC CTCTACGGCA TGTCGGCCCA CCGCCTCTCC 2040

CAGGAGCTAG CCATCCCTTA CGAGGAGGCC CAGGCCTTCA TTGAGCGCTA CTTTCAGAGC 2100

TTCCCCAAGG TGCGGGCCTG GATTGAGAAG ACCCTGGAGG AGGGCAGGAG GCGGGGGTAC 2160

GTGGAGACCC TCTTCGGCCG CCGCCGCTAC GTGCCAGACC TAGAGGCCCG GGTGAAGAGC 2220

GTGCGGGAGG CGGCCGAGCG CATGGCCTTC AACATGCCCG TCCGGGGCAC CGCCGCCGAC 2280

CTCATGAAGC TGGCTATGGT GAAGCTCTTC CCCAGGCTGG AGGAAATGGG GGCCAGGATG 2340

CTCCTTCAGG TCCACGACGA GCTGGTCCTC GAGGCCCCAA AAGAGAGGGC GGAGGCCGTG 2400

GCCCGGCTGG CCAAGGAGGT CATGGAGGGG GTGTATCCCC TGGCCGTGCC CCTGGAGGTG 2460

GAGGTGGGGA TAGGGGAGGA CTGGCTCTCC GCCAAGGAGT GA 2502



(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22: 

GATTTAGGTG ACACTATAG 19

(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 72 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:23:
CGGACGAACA AGCGAGACAG CGACACAGGT ACCACATGGT ACAAGAGGCA AGAGAGACGA 60
CACAGCAGAA AC 72
(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 70 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:24: 
GTTTCTGCTG TGTCGTCTCT CTTGCCTCTT GTACCATGTG GTACCTGTGT CGCTGTCTCG 60
CTTGTTCGTC 70
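
Several of the oligonucleotides in this part of the listing appear to be related by base-pairing complementarity. The sketch below is illustrative only and is not part of the original disclosure. It checks three relationships that follow from comparing the listed strings themselves: SEQ ID NO: 23, less its first two nucleotides, is the reverse complement of SEQ ID NO: 24; SEQ ID NO: 25 is the first 20 nucleotides of that reverse complement; and SEQ ID NO: 28 is the last 50 nucleotides of SEQ ID NO: 23. The helper name `reverse_complement` is an assumption made for the example.

```python
# Map each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def join_blocks(text):
    """Remove the spaces that separate the ten-base blocks of the listing."""
    return text.replace(" ", "")


# Sequences copied from the listing (5' to 3').
SEQ_ID_23 = join_blocks(
    "CGGACGAACA AGCGAGACAG CGACACAGGT ACCACATGGT ACAAGAGGCA AGAGAGACGA "
    "CACAGCAGAA AC"
)
SEQ_ID_24 = join_blocks(
    "GTTTCTGCTG TGTCGTCTCT CTTGCCTCTT GTACCATGTG GTACCTGTGT CGCTGTCTCG "
    "CTTGTTCGTC"
)
SEQ_ID_25 = join_blocks("GACGAACAAG CGAGACAGCG")
SEQ_ID_28 = join_blocks(
    "ACACAGGTAC CACATGGTAC AAGAGGCAAG AGAGACGACA CAGCAGAAAC"
)

rc24 = reverse_complement(SEQ_ID_24)
print(SEQ_ID_23[2:] == rc24)           # True
print(rc24.startswith(SEQ_ID_25))      # True
print(SEQ_ID_23.endswith(SEQ_ID_28))   # True
```

The same comparison can be used as a consistency check on a transcribed listing: a single misread base in either strand makes the first comparison fail.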
(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 
GACGAACAAG CGAGACAGCG 20

(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:26: 
GTTTCTGCTG TGTCGTCTCT CTTG 24 
(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 46 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:27: 
CCTCTTGTAC CATGTGGTAC CTGTGTCGCT GTCTCGCTTG TTCGTC 46
(2) INFORMATION FOR SEQ ID NO: 28:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 50 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:28: 
ACACAGGTAC CACATGGTAC AAGAGGCAAG AGAGACGACA CAGCAGAAAC 50
(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 

Met Ala Ser Met Thr Gly Gly Gln Gln Met Gly Arg Ile Asn Ser
1               5                   10                  15

(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 969 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 



ATGGCTAGCA TGACTGGTGG ACAGCAAATG GGTCGGATCA ATTCGGGGAT GCTGCCCCTC 60

TTTGAGCCCA AGGGCCGGGT CCTCCTGGTG GACGGCCACC ACCTGGCCTA CCGCACCTTC 120

CACGCCCTGA AGGGCCTCAC CACCAGCCGG GGGGAGCCGG TGCAGGCGGT CTACGGCTTC 180

GCCAAGAGCC TCCTCAAGGC CCTCAAGGAG GACGGGGACG CGGTGATCGT GGTCTTTGAC 240

GCCAAGGCCC CCTCCTTCCG CCACGAGGCC TACGGGGGGT ACAAGGCGGG CCGGGCCCCC 300

ACGCCGGAGG ACTTTCCCCG GCAACTCGCC CTCATCAAGG AGCTGGTGGA CCTCCTGGGG 360

CTGGCGCGCC TCGAGGTCCC GGGCTACGAG GCGGACGACG TCCTGGCCAG CCTGGCCAAG 420

AAGGCGGAAA AGGAGGGCTA CGAGGTCCGC ATCCTCACCG CCGACAAAGA CCTTTACCAG 480

CTTCTTTCCG ACCGCATCCA CGTCCTCCAC CCCGAGGGGT ACCTCATCAC CCCGGCCTGG 540

CTTTGGGAAA AGTACGGCCT GAGGCCCGAC CAGTGGGCCG ACTACCGGGC CCTGACCGGG 600

GACGAGTCCG ACAACCTTCC CGGGGTCAAG GGCATCGGGG AGAAGACGGC GAGGAAGCTT 660

CTGGAGGAGT GGGGGAGCCT GGAAGCCCTC CTCAAGAACC TGGACCGGCT GAAGCCCGCC 720

ATCCGGGAGA AGATCCTGGC CCACATGGAC GATCTGAAGC TCTCCTGGGA CCTGGCCAAG 780

GTGCGCACCG ACCTGCCCCT GGAGGTGGAC TTCGCCAAAA GGCGGGAGCC CGACCGGGAG 840

AGGCTTAGGG CCTTTCTGGA GAGGCTTGAG TTTGGCAGCC TCCTCCACGA GTTCGGCCTT 900

GTGGAAAGCC CCAAGTCATG GAGGGGGTGT ATCCCCTGGC CGTGCCCCTG GAGGTGGAGG 960

TGGGGATAG 969
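
The first 45 nucleotides of SEQ ID NO: 30 can be translated to show how they relate to the 15-residue peptide of SEQ ID NO: 29 (Met Ala Ser Met Thr Gly Gly Gln Gln Met Gly Arg Ile Asn Ser). The following is a minimal sketch and is not part of the original disclosure. It assumes the standard genetic code and a reading frame that starts at the first ATG; the names `CODON_TABLE`, `THREE_LETTER` and `translate` are hypothetical.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}


def translate(dna):
    """Translate an upper-case DNA string, ignoring any trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First 45 nucleotides of SEQ ID NO: 30, copied from the listing.
LEADER = "ATGGCTAGCA TGACTGGTGG ACAGCAAATG GGTCGGATCA ATTCG".replace(" ", "")

protein = translate(LEADER)
print(protein)                                   # MASMTGGQQMGRINS
print(" ".join(THREE_LETTER[r] for r in protein))
```

Under these assumptions the translated leader matches SEQ ID NO: 29 residue for residue.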
(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 948 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 



ATGGCTAGCA TGACTGGTGG ACAGCAAATG GGTCGGATCA ATTCGGGGAT GCTGCCCCTC 60

TTTGAGCCCA AGGGCCGGGT CCTCCTGGTG GACGGCCACC ACCTGGCCTA CCGCACCTTC 120

CACGCCCTGA AGGGCCTCAC CACCAGCCGG GGGGAGCCGG TGCAGGCGGT CTACGGCTTC 180

GCCAAGAGCC TCCTCAAGGC CCTCAAGGAG GACGGGGACG CGGTGATCGT GGTCTTTGAC 240

GCCAAGGCCC CCTCCTTCCG CCACGAGGCC TACGGGGGGT ACAAGGCGGG CCGGGCCCCC 300

ACGCCGGAGG ACTTTCCCCG GCAACTCGCC CTCATCAAGG AGCTGGTGGA CCTCCTGGGG 360

CTGGCGCGCC TCGAGGTCCC GGGCTACGAG GCGGACGACG TCCTGGCCAG CCTGGCCAAG 420

AAGGCGGAAA AGGAGGGCTA CGAGGTCCGC ATCCTCACCG CCGACAAAGA CCTTTACCAG 480

CTTCTTTCCG ACCGCATCCA CGTCCTCCAC CCCGAGGGGT ACCTCATCAC CCCGGCCTGG 540

CTTTGGGAAA AGTACGGCCT GAGGCCCGAC CAGTGGGCCG ACTACCGGGC CCTGACCGGG 600

GACGAGTCCG ACAACCTTCC CGGGGTCAAG GGCATCGGGG AGAAGACGGC GAGGAAGCTT 660

CTGGAGGAGT GGGGGAGCCT GGAAGCCCTC CTCAAGAACC TGGACCGGCT GAAGCCCGCC 720

ATCCGGGAGA AGATCCTGGC CCACATGGAC GATCTGAAGC TCTCCTGGGA CCTGGCCAAG 780

GTGCGCACCG ACCTGCCCCT GGAGGTGGAC TTCGCCAAAA GGCGGGAGCC CGACCGGGAG 840

AGGCTTAGGG CCTTTCTGGA GAGGCTTGAG TTTGGCAGCC TCCTCCACGA GTTCGGCCTT 900

CTGGAAAGCC CCAAGGCCGC ACTCGAGCAC CACCACCACC ACCACTGA 948



(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 206 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 
CGCCAGGGTT TTCCCAGTCA CGACGTTGTA AAACGACGGC CAGTGAATTG TAATACGACT 60
CACTATAGGG CGAATTCGAG CTCGGTACCC GGGGATCCTC TAGAGTCGAC CTGCAGGCAT 120
GCAAGCTTGA GTATTCTATA GTGTCACCTA AATAGCTTGG CGTAATCATG GTCATAGCTG 180
TTTCCTGTGT GAAATTGTTA TCCGCT 206
(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 43 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 
TTCTGGGTTC TCTGCTCTCT GGTCGCTGTC TCGCTTGTTC GTC 43
(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:34: 
GCTGTCTCGC TTGTTCGTC 19
(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 
GACGAACAAG CGAGACAGCG 20
(2) INFORMATION FOR SEQ ID NO:36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:36: 
TTCTGGGTTC TCTGCTCTCT GGTC 24
(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 43 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:
GACGAACAAG CGAGACAGCG ACCAGAGAGC AGAGAACCCA GAA 43
(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:38: 
ACCAGAGAGC AGAGAACCCA GAA 23
(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 
AACAGCTATG ACCATGATTA C 21
(2) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40: 
GGATCCTCTA GAGTCGACCT GCAGGCATGC 30

CLAIMS

We claim: 

1. A DNA sequence encoding a thermostable DNA polymerase altered
in sequence relative to the native sequence such that it exhibits altered DNA
synthetic activity from that of the native DNA polymerase.

2. The DNA sequence of Claim 1 wherein the encoded DNA
polymerase exhibits reduced synthetic activity from that of the native DNA
polymerase.

3. The DNA sequence of Claim 1 wherein the alteration to said native
sequence comprises a change in a single nucleotide.

4. The DNA sequence of Claim 1 wherein the alteration to said native
sequence comprises a deletion.

5. The DNA sequence of Claim 4 comprising a DNA sequence selected
from the group consisting of SEQ ID NOS:9-12 and 21.

6. The DNA sequence of Claim 1 wherein the alteration to said native
sequence comprises an insertion.

7. The DNA sequence of Claim 2 derived from an organism selected
from the group consisting of Thermus aquaticus, Thermus flavus and Thermus
thermophilus.

8. A recombinant DNA vector comprising a DNA sequence encoding a
thermostable DNA polymerase altered in sequence relative to the native sequence
such that it exhibits altered DNA synthetic activity from that of the native DNA
polymerase.

9. The recombinant DNA vector comprising a DNA sequence of
Claim 8 wherein the alteration to said native sequence comprises a change in a
single nucleotide.

10. The recombinant DNA vector comprising a DNA sequence of
Claim 8 wherein the alteration to said native sequence comprises a deletion.

11. The recombinant DNA vector comprising a DNA sequence of
Claim 8 derived from an organism selected from the group consisting of Thermus
aquaticus, Thermus flavus and Thermus thermophilus.

12. The recombinant DNA vector comprising a DNA sequence of
Claim 11 comprising a DNA sequence selected from the group consisting of SEQ
ID NOS:9-12 and 21.

13. A host cell transformed with the recombinant vector of Claim 8. 

14. A thermostable DNA polymerase altered in amino acid sequence
such that it exhibits altered DNA synthetic activity from that of the native DNA
polymerase but retains substantially the same 5' nuclease activity of that of the
native DNA polymerase.

15. The polymerase of Claim 14 wherein the altered polymerase exhibits
reduced synthetic activity from that of the native DNA polymerase.

16. The polymerase of Claim 15 wherein the alteration to said native
sequence comprises a change in an amino acid.

17. The polymerase of Claim 15 wherein the alteration to said native
sequence comprises a deletion.

18. The polymerase of Claim 15 derived from an organism selected from
the group consisting of Thermus aquaticus, Thermus flavus and Thermus
thermophilus.

19. The polymerase of Claim 18 comprising an amino acid sequence
encoded by the nucleic acid sequences selected from the group consisting of SEQ
ID NOS:9-12 and 21.



20. A method of detecting the presence of a specific target DNA
molecule comprising:

a) providing:

i) a target nucleic acid,

ii) a first oligonucleotide complementary to a first portion
of said target nucleic acid, and

iii) a second oligonucleotide, a region of which is
complementary to a second portion of said target nucleic acid, said
non-complementary region of said second oligonucleotide providing
a single-stranded arm at its 5' end;

b) mixing said target nucleic acid, said first oligonucleotide and
said second oligonucleotide under conditions wherein said first
oligonucleotide and the 3' end of said second oligonucleotide are annealed
to said target DNA sequence so as to create a first cleavage structure;

c) providing a cleavage means under conditions such that
cleavage of said first cleavage structure occurs preferentially at a site
located within said second oligonucleotide in a manner dependent upon the
annealing of said first and second oligonucleotides on said target nucleic
acid, thereby liberating the single-stranded arm of said second
oligonucleotide generating a third oligonucleotide;

d) providing a first hairpin structure having a single-stranded 3'
arm and a single-stranded 5' arm under conditions wherein said third
oligonucleotide anneals to said single-stranded 3' arm of said first hairpin
thereby creating a second cleavage structure;

e) providing conditions under which cleavage of said second
cleavage structure occurs by said cleavage means liberating the
single-stranded 5' arm of said second cleavage structure so as to create reaction
products comprising a fourth oligonucleotide and a first cleaved hairpin
detection molecule;

f) providing a second hairpin structure having a single-stranded
3' arm and a single-stranded 5' arm under conditions wherein said fourth
oligonucleotide anneals to the single-stranded 3' arm of said second hairpin
thereby creating a third cleavage structure;

g) providing conditions under which cleavage of said third
cleavage structure occurs by said cleavage means liberating the
single-stranded 5' arm of said third cleavage structure so as to create reaction
products comprising a fifth oligonucleotide identical in sequence to said
third oligonucleotide and a second cleaved hairpin detection molecule; and

h) detecting the presence of said first and second cleaved hairpin
detection molecules.
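
The cascade recited in steps d) through g) of Claim 20 can be pictured with a simple counting model. The sketch below is illustrative only and is not part of the claims. It rests on assumptions that the claim does not state: reactions proceed in discrete rounds, every free third or fourth oligonucleotide cleaves at most one hairpin per round, the triggering oligonucleotides are released after cleavage and remain available, and each hairpin pool is finite. The function name `run_cascade` and its parameters are hypothetical.

```python
def run_cascade(initial_third, hairpin1, hairpin2, rounds):
    """Count cleaved hairpins in a toy model of steps d) through g).

    initial_third: third oligonucleotides liberated in step c)
    hairpin1, hairpin2: starting numbers of first and second hairpins
    rounds: how many times steps d) through g) are repeated

    Returns (cleaved first hairpins, cleaved second hairpins).
    """
    third, fourth = initial_third, 0
    cleaved1 = cleaved2 = 0
    for _ in range(rounds):
        # Steps d) and e): each third oligonucleotide cleaves one first
        # hairpin, releasing one fourth oligonucleotide.
        cut = min(third, hairpin1 - cleaved1)
        cleaved1 += cut
        fourth += cut
        # Steps f) and g): each fourth oligonucleotide cleaves one second
        # hairpin, releasing a fifth oligonucleotide that is identical in
        # sequence to the third and is therefore counted with it.
        cut = min(fourth, hairpin2 - cleaved2)
        cleaved2 += cut
        third += cut
    return cleaved1, cleaved2


print(run_cascade(1, 1000, 1000, 3))  # (8, 12)
print(run_cascade(0, 1000, 1000, 3))  # (0, 0)
```

In this model a single initial cleavage event yields a growing number of cleaved hairpins for detection in step h), while no hairpin is cleaved when step c) liberates nothing. Real reaction kinetics would differ; the numbers only illustrate the feedback between the two hairpin populations.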



21. The method of Claim 20 wherein steps d) through g) are repeated at
least once.

22. The method of Claim 20 wherein said cleavage means comprises an
altered thermostable DNA polymerase such that cleavage reactions occur in the
absence of any significant polymerase activity.

23. The method of Claim 20 wherein the cleavage reactions of steps c),
e) and g) will not occur absent the annealing of said first oligonucleotide, said third
oligonucleotide and said fourth oligonucleotide, respectively.

24. The method of Claim 20 wherein the cleavage reaction of step c)
occurs within the annealed portion of said second oligonucleotide.

25. The method of Claim 20 wherein the cleavage reaction of step c)
occurs within the non-annealed portion of said second oligonucleotide.

26. A method of detecting the presence of a specific target nucleic acid
molecule comprising:

a) providing:

i) a cleavage means,

ii) a target nucleic acid,

iii) a first oligonucleotide complementary to a first portion
of said target nucleic acid,

iv) a first solid support having a second oligonucleotide, a
region of which is complementary to a second portion of said target
nucleic acid, said non-complementary region of said second
oligonucleotide providing a single-stranded arm at its 5' end, a
portion of said 5' arm comprising a first signal oligonucleotide,

v) a plurality of uncleaved second solid supports each
having a third oligonucleotide, a region of which is complementary
to said first signal oligonucleotide, the non-complementary region of
said third oligonucleotide providing a single-stranded arm at its 5'
end, a portion of said 5' arm comprising a second signal
oligonucleotide, and

vi) a plurality of uncleaved third solid supports each
having a fourth oligonucleotide, a region of which is complementary
to said second signal oligonucleotide, the non-complementary region
of said fourth oligonucleotide providing a single-stranded arm at its
5' end, a portion of said 5' arm comprising said first signal
oligonucleotide;

b) mixing said cleavage means, said target nucleic acid, said
first oligonucleotide and said second oligonucleotide under conditions
wherein said first oligonucleotide and the 3' end of said second
oligonucleotide are annealed to said target DNA sequence so as to create a
first cleavage structure and cleavage of said first cleavage structure results
in the liberating of said first signal oligonucleotide;

c) reacting said liberated first signal oligonucleotide with one of
said plurality of second solid supports under conditions such that said first
signal oligonucleotide hybridizes to said complementary region of said third
oligonucleotide to create a second cleavage structure and cleavage of said
second cleavage structure results in the liberating of said second signal
oligonucleotide and a cleaved second solid support;

d) reacting said liberated second signal oligonucleotide with one
of said plurality of third solid supports under conditions such that said
second signal oligonucleotide hybridizes to said complementary region of
said fourth oligonucleotide to create a third cleavage structure and cleavage
of said third cleavage structure results in the liberating of a second molecule
of said first signal oligonucleotide and a cleaved third solid support; and

e) detecting the presence of said first and second signal
oligonucleotides.



27. The method of Claim 26 wherein said cleavage means comprises a
5' nuclease derived from a thermostable DNA polymerase.

28. The method of Claim 27 wherein said thermostable DNA polymerase
is derived from an organism selected from the group consisting of Thermus
aquaticus, Thermus flavus and Thermus thermophilus.

29. The method of Claim 28 wherein said 5' nuclease is encoded by a
DNA sequence selected from the group consisting of SEQ ID NOS:11, 30 and 31.